ACTIVIN RECEPTOR-LIKE KINASES, PROTEINS HAVING
SERINE THREONINE KINASE DOMAINS AND THEIR USE

Field of the Invention

This invention relates to proteins having
serine/threonine kinase domains, corresponding nucleic acid
molecules, and their use.

Background of the Invention

The transforming growth factor-β (TGF-β) superfamily
consists of a family of structurally-related proteins,
including three different mammalian isoforms of TGF-β
(TGF-β1, β2 and β3), activins, inhibins, müllerian-inhibiting
substance and bone morphogenic proteins (BMPs) (for reviews
see Roberts and Sporn (1990) Peptide Growth Factors and
Their Receptors, Pt. 1, Sporn and Roberts, eds. (Berlin:
Springer-Verlag) pp. 419-472; Moses et al (1990) Cell 63,
245-247). The proteins of the TGF-β superfamily have a
wide variety of biological activities. TGF-β acts as a
growth inhibitor for many cell types and appears to play a
central role in the regulation of embryonic development,
tissue regeneration, immuno-regulation, as well as in
fibrosis and carcinogenesis (Roberts and Sporn (1990), see
above).

Activins and inhibins were originally identified as
factors which regulate secretion of follicle-stimulating
hormone (Vale et al (1990) Peptide Growth Factors
and Their Receptors, Pt. 2, Sporn and Roberts, eds. (Berlin:
Springer-Verlag) pp. 211-248). Activins were also shown to
induce the differentiation of haematopoietic progenitor
cells (Murata et al (1988) Proc. Natl. Acad. Sci. USA 85,
2434-2438; Eto et al (1987) Biochem. Biophys. Res.
Commun. 142, 1095-1103) and induce mesoderm formation in
Xenopus embryos (Smith et al (1990) Nature 345, 729-731;
van den Eijnden-Van Raaij et al (1990) Nature 345, 732-
734).

BMPs or osteogenic proteins, which induce the formation
of bone and cartilage when implanted subcutaneously (Wozney
et al (1988) Science 242, 1528-1534), facilitate neuronal
differentiation (Paralkar et al (1992) J. Cell Biol. 119,
1721-1728) and induce monocyte chemotaxis (Cunningham et al
(1992) Proc. Natl. Acad. Sci. USA 89, 11740-11744).
Müllerian-inhibiting substance induces regression of the
Müllerian duct in the male reproductive system (Cate et al
(1986) Cell 45, 685-698), and a glial cell line-derived
neurotrophic factor enhances survival of midbrain
dopaminergic neurons (Lin et al (1993) Science 260, 1130-
1132). The action of these growth factors is mediated
through binding to specific cell surface receptors.

Within this family, TGF-β receptors have been most
thoroughly characterized. By covalently cross-linking
radio-labelled TGF-β to cell surface molecules followed by
polyacrylamide gel electrophoresis of the affinity-labelled
complexes, three distinct size classes of cell surface
proteins (in most cases) have been identified, denoted
receptor type I (53 kd), type II (75 kd), type III or
betaglycan (a 300 kd proteoglycan with a 120 kd core
protein) (for a review see Massague (1992) Cell 69, 1067-
1070) and more recently endoglin (a homodimer of two 95 kd
subunits) (Cheifetz et al (1992) J. Biol. Chem. 267, 19027-
19030). Current evidence suggests that type I and type II
receptors are directly involved in receptor signal
transduction (Segarini et al (1989) Mol. Endo. 3, 261-272;
Laiho et al (1991) J. Biol. Chem. 266, 9108-9112) and may
form a heteromeric complex; the type II receptor is needed
for the binding of TGF-β to the type I receptor and the
type I receptor is needed for the signal transduction
induced by the type II receptor (Wrana et al (1992) Cell
71, 1003-1014). The type III receptor and endoglin may
have more indirect roles, possibly by facilitating the
binding of ligand to type II receptors (Wang et al (1991)
Cell 67, 797-805; Lopez-Casillas et al (1993) Cell 73,
1435-1444).

Binding analyses with activin A and BMP4 have led to
the identification of two co-existing cross-linked affinity
complexes of 50-60 kDa and 70-80 kDa on responsive cells
(Hino et al (1989) J. Biol. Chem. 264, 10309-10314;
Mathews and Vale (1991), Cell [illegible], 775-785; Paralkar
et al (1991) Proc. Natl. Acad. Sci. USA [illegible],
8913-8917). By analogy with TGF-β receptors they are
thought to be signalling receptors and have been named
type I and type II receptors.

Among the type II receptors for the TGF-β superfamily
of proteins, the cDNA for the activin type II receptor (Act
RII) was the first to be cloned (Mathews and Vale (1991)
Cell 65, 973-982). The predicted structure of the receptor
was shown to be a transmembrane protein with an
intracellular serine/threonine kinase domain. The activin
receptor is related to the C. elegans daf-1 gene product,
but the ligand is currently unknown (Georgi et al (1990)
Cell 61, 635-645). Thereafter, another form of the activin
type II receptor (activin type IIB receptor), of which
there are different splicing variants (Mathews et al
(1992), Science 255, 1702-1705; Attisano et al (1992) Cell
68, 97-108), and the TGF-β type II receptor (TβRII) (Lin et
al (1992) Cell 68, 775-785) were cloned, both of which have
putative serine/threonine kinase domains.

Summary of the Invention

The present invention involves the discovery of
related novel peptides, including peptides having the
activity of those defined herein as SEQ ID Nos. 2, 4, 8,
10, 12, 14, 16 and 18. Their discovery is based on the
realisation that receptor serine/threonine kinases form a
new receptor family, which may include the type II
receptors for other proteins in the TGF-β superfamily. To
ascertain whether there were other members of this family
of receptors, a protocol was designed to clone ActRII/daf-1
related cDNAs. This approach made use of the polymerase
chain reaction (PCR), using degenerate primers based upon
the amino-acid sequence similarity between kinase domains
of the mouse activin type II receptor and daf-1 gene
products.

This strategy resulted in the isolation of a new
family of receptor kinases called activin receptor like
kinases (ALK's) 1-6. These cDNAs showed an overall 33-39%
sequence similarity with ActRII and TGF-β type II receptor
and 40-92% sequence similarity towards each other in the
kinase domains.

Soluble receptors according to the invention comprise
at least predominantly the extracellular domain. These can
be selected from the information provided herein, prepared
in conventional manner, and used in any manner associated
with the invention.

Antibodies to the peptides described herein may be 
raised in conventional manner. By selecting unique 
sequences of the peptides, antibodies having desired 
specificity can be obtained.

The antibodies may be monoclonal, prepared in known 
manner. In particular, monoclonal antibodies to the 
extracellular domain are of potential value in therapy.

Products of the invention are useful in diagnostic
methods, e.g. to determine the presence in a sample of an
analyte binding therewith, such as in an antagonist assay.
Conventional techniques, e.g. an enzyme-linked
immunosorbent assay, may be used.

Products of the invention having a specific receptor
activity can be used in therapy, e.g. to modulate
conditions associated with activin or TGF-β activity. Such
conditions include fibrosis, e.g. liver cirrhosis and
pulmonary fibrosis, cancer, rheumatoid arthritis and
glomerulonephritis.

Brief Description of the Drawings

Figure 1 shows the alignment of the serine/threonine 
(S/T) kinase domains (I-VIII) of related receptors from 
transmembrane proteins, including embodiments of the 
present invention. The nomenclature of the subdomains is
according to Hanks et al (1988).

Figures 2A to 2D show the sequences and
characteristics of the respective primers used in the
initial PCR reactions. The nucleic acid sequences are also
given as SEQ ID Nos. 19 to 22.

Figure 3 is a comparison of the amino-acid sequences
of human activin type II receptor (Act R-II), mouse activin
type IIB receptor (Act R-IIB), human TGF-β type II receptor
(TβR-II), human TGF-β type I receptor (ALK-5), human
activin receptor type IA (ALK-2), and type IB (ALK-4), ALKs
1 & 3 and mouse ALK-6.

Figure 4 shows, schematically, the structures for Daf-1,
Act R-II, Act R-IIB, TβR-II, TβR-I/ALK-5, ALK's -1, -2
(Act RIA), -3, -4 (Act RIB) & -6.

Figure 5 shows the sequence alignment of the cysteine-
rich domains of the ALKs, TβR-II, Act R-II, Act R-IIB and
daf-1 receptors.

Figure 6 is a comparison of kinase domains of
serine/threonine kinases, showing the percentage amino-acid
identity of the kinase domains.

Figure 7 shows the pairwise alignment relationship
between the kinase domains of the receptor serine/threonine
kinases. The dendrogram was generated using the Jotun-Hein
alignment program (Hein (1990) Meth. Enzymol. 183, 626-
645).

Brief Description of the Sequence Listings

Sequences 1 and 2 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-1 (clone HP57).

Sequences 3 and 4 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-2 (clone HP53).

Sequences 5 and 6 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-3 (clone ONF5).

Sequences 7 and 8 are the nucleotide and deduced amino-
acid sequences of cDNA for hALK-4 (clone 11H8),
complemented with PCR product encoding extracellular
domain.

Sequences 9 and 10 are the nucleotide and deduced
amino-acid sequences of cDNA for hALK-5 (clone EMBLA).

Sequences 11 and 12 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-1 (clone AM6).

Sequences 13 and 14 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-3 (clones ME-7 and
ME-D).

Sequences 15 and 16 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-4 (clone 8a1).

Sequences 17 and 18 are the nucleotide and deduced
amino-acid sequences of cDNA for mALK-6 (clone ME-6).

Sequence 19 (B1-S) is a sense primer, extracellular
domain, cysteine-rich region, BamHI site at 5' end, 28-mer,
64-fold degeneracy.

Sequence 20 (B3-S) is a sense primer, kinase domain
II, BamHI site at 5' end, 25-mer, 162-fold degeneracy.

Sequence 21 (B7-S) is a sense primer, kinase domain
VIB, S/T kinase specific residues, BamHI site at 5' end,
24-mer, 288-fold degeneracy.

Sequence 22 (E8-AS) is an anti-sense primer, kinase
domain, S/T kinase-specific residues, EcoRI site at 5' end,
20-mer, 18-fold degeneracy.

Sequence 23 is an oligonucleotide probe.

Sequence 24 is a 5' primer.

Sequence 25 is a 3' primer.

Sequence 26 is a consensus sequence in Subdomain I.

Sequences 27 and 28 are novel sequence motifs in
Subdomain VIB.

Sequence 29 is a novel sequence motif in Subdomain
VIII.

Description of the Invention 

As described in more detail below, nucleic acid
sequences have been isolated, coding for a new sub-family
of serine/threonine receptor kinases. The term nucleic
acid molecules as used herein refers to any sequence which
codes for the murine, human or mammalian form, amino-acid
sequences of which are presented herein. It is understood
that the well known phenomenon of codon degeneracy provides
for a great deal of sequence variation and all such
varieties are included within the scope of this invention.
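
As a rough illustration of the scale of sequence variation that codon degeneracy permits, the following Python sketch counts how many distinct nucleotide sequences could encode a given peptide under the standard genetic code. The function name and the example peptides are illustrative assumptions and are not taken from the sequences of this specification.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino-acid codes; the 61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def coding_sequence_count(peptide: str) -> int:
    """Return how many nucleotide sequences encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


# Hypothetical three-residue peptides, chosen only to show the range.
print(coding_sequence_count("MWM"))  # methionine and tryptophan: one codon each
print(coding_sequence_count("SLR"))  # serine, leucine, arginine: six codons each
```

Even a short peptide rich in serine, leucine or arginine can be encoded by hundreds of different nucleotide sequences, which is why the scope is expressed in terms of the encoded amino-acid sequence.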
The nucleic acid sequences described herein may be
used to clone the respective genomic DNA sequences in order
to study the genes' structure and regulation. The murine
and human cDNA or genomic sequences can also be used to
isolate the homologous genes from other mammalian species.
The mammalian DNA sequences can be used to study the
receptors' functions in various in vitro and in vivo model
systems.

As exemplified below for ALK-5 cDNA, it is also
recognised that, given the sequence information provided
herein, the artisan could easily combine the molecules with
a pertinent promoter in a vector, so as to produce a
cloning vehicle for expression of the molecule. The
promoter and coding molecule must be operably linked via
any of the well-recognized and easily-practised
methodologies for so doing. The resulting vectors, as well
as the isolated nucleic acid molecules themselves, may be
used to transform prokaryotic cells (e.g. E. coli), or
transfect eukaryotes such as yeast (S. cerevisiae),
COS or CHO cell lines. Other appropriate expression
systems will also be apparent to the skilled artisan.

Several methods may be used to isolate the ligands for
the ALKs. As shown for ALK-5 cDNA, cDNA clones encoding
the active open reading frames can be subcloned into
expression vectors and transfected into eukaryotic cells,
for example COS cells. The transfected cells which can
express the receptor can be subjected to binding assays for
radioactively-labelled members of the TGF-β superfamily
(TGF-β, activins, inhibins, bone morphogenic proteins and
müllerian-inhibiting substances), as it may be expected
that the receptors will bind members of the TGF-β
superfamily. Various biochemical or cell-based assays can
be designed to identify the ligands, in tissue extracts or
conditioned media, for receptors in which a ligand is not
known. Antibodies raised to the receptors may also be used
to identify the ligands, using the immunoprecipitation of
the cross-linked complexes. Alternatively, purified
receptor could be used to isolate the ligands using an
affinity-based approach. The determination of the
expression patterns of the receptors may also aid in the
isolation of the ligand. These studies may be carried out
using ALK DNA or RNA sequences as probes to perform in situ
hybridisation studies.

w ,rious model systems or strucv«* 
The use of various mouei * - Bec if ic agonists 

enooid .nab!, th. rational d.v.lop« It 

with the receptors. M . Mle8 of the invention 

ssrr js= « — «- ■— - 

v rat rabbit and monkey, 

as mouse, human, rat, raoo* relates to specific 

The following description relates to specific
embodiments. It will be understood that the specification
and examples are illustrative but not limitative of the
present invention and that other embodiments within the
spirit and scope of the invention will suggest themselves
to those skilled in the art.

is JZ fro. a - 2^ Co^cTn 

(M CC TIB «01 i. » and TGF-B. Koreover 

30 shown to ^ sources f „ tte 

iauxaanic calls hav« pr (Partan.n St U 

clonin, of -^ "^"^ USA SI. ..»-»» - <»»> 
,»,.) Proc. »tl. Wrt m VM prep „. d 

Mol. Call. Biol. li. !«»» »° 7 > 1 (chirgwin si il 

by the guanidinio. isothioc, » thod^ 

(lfTt) Biochemistry li. *™ ■ ^ m 

using th poly-h or poly M trac 
(Promega, Madison, Wisconsin, U.S.A.) as described by the
manufacturers, or purified through an oligo (dT)-cellulose
column as described by Aviv and Leder (1972) Proc. Natl.
Acad. Sci. USA 69, 1408-1412. The isolated mRNA was used
for the synthesis of random primed (Amersham) cDNA, that
was used to make a λgt10 library with [illegible] independent
cDNA clones using the RiboClone cDNA synthesis system
(Promega) and λgt10 in vitro packaging kit (Amersham),
according to the manufacturers' procedures. An amplified
oligo (dT) primed human placenta λZAPII cDNA library of
[illegible] independent clones was used. Poly (A)+ RNA isolated
from AG1518 human foreskin fibroblasts was used to prepare
a primary random primed λZAPII cDNA library of 1.5x10[illegible]
independent clones using the RiboClone cDNA synthesis
system and Gigapack Gold II packaging extract (Stratagene).
In addition, a primary oligo (dT) primed human foreskin
fibroblast λgt10 cDNA library (Claesson-Welsh et al (1989)
Proc. Natl. Acad. Sci. USA 86, 4917-4921) was prepared. An
amplified oligo (dT) primed HEL cell λgt11 cDNA library of
[illegible] independent clones (Poncz et al (1987) Blood 69,
219-223) was used. A twelve-day mouse embryo λEXlox cDNA
library was obtained from Novagen (Madison, Wisconsin,
U.S.A.); a mouse placenta λZAPII cDNA library was also
used.






Generation of cDNA Probes by PCR

For the generation of cDNA probes by PCR (Lee et al
(1988) Science 239, 1288-1291) degenerate PCR primers were
constructed based upon the amino-acid sequence similarity
between the mouse activin type II receptor (Mathews and
Vale (1991) Cell 65, 973-982) and daf-1 (Georgi et al
(1990) Cell 61, 635-645) in the kinase domains II and VIII.
Figure 1 shows the aligned serine/threonine kinase domains
(I-VIII) of four related receptors of the TGF-β
superfamily, i.e. hTβR-II, mActR-IIB, mActR-II and the daf-1
gene product, using the nomenclature of the subdomains
according to Hanks et al (1988) Science 241, 45-52.

Several considerations were applied in the design of
the PCR primers. The sequences were taken from regions of
homology between the activin type II receptor and the daf-1
gene product, with particular emphasis on residues that
confer serine/threonine specificity (see Table 2) and on
residues that are shared by transmembrane kinase proteins
and not by cytoplasmic kinases. The primers were designed
so that each primer of a PCR set had an approximately
similar GC composition, and so that self complementarity
and complementarity between the 3' ends of the primer sets
were avoided. Degeneracy of the primers was kept as low as
possible, in particular avoiding serine, leucine and
arginine residues (6 possible codons), and human codon
preference was applied. Degeneracy was particularly
avoided at the 3' end as, unlike the 5' end, where
mismatches are tolerated, mismatches at the 3' end
dramatically reduce the efficiency of PCR.
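
The fold degeneracy of a degenerate primer, and how much of it lies near the 3' end, can be worked out from the IUPAC ambiguity codes. The Python sketch below shows one way to do so. The example primer is hypothetical (it is not one of SEQ ID Nos. 19 to 22), and the five-base window used for the 3' end is an assumption made for illustration.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct sequences in a degenerate primer pool."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


def three_prime_degeneracy(primer: str, window: int = 5) -> int:
    """Return the degeneracy contributed by the last `window` bases (3' end)."""
    return primer_degeneracy(primer[-window:])


# Hypothetical 20-mer with a BamHI site (GGATCC) at the 5' end.
EXAMPLE_PRIMER = "GGATCCAARGGNMGNTAYGG"
print(primer_degeneracy(EXAMPLE_PRIMER))       # overall fold degeneracy
print(three_prime_degeneracy(EXAMPLE_PRIMER))  # degeneracy within the 3' window
```

A low value from the second function relative to the first indicates that the ambiguity has been pushed away from the 3' end, in line with the design considerations described above.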

In order to facilitate directional subcloning,
restriction enzyme sites were included at the 5' end of the
primers, with a GC clamp, which permits efficient
restriction enzyme digestion. The primers utilised are
shown in Figure 2. Oligonucleotides were synthesized using
Gene Assembler Plus (Pharmacia-LKB) according to the
manufacturer's instructions.

The mRNA prepared from HEL cells as described above
was reverse-transcribed into cDNA in the presence of 50 mM
Tris-HCl, pH 8.3, 8 mM MgCl2, 30 mM KCl, 10 mM
dithiothreitol, 2 mM nucleotide triphosphates, excess oligo
(dT) primers and 34 units of AMV reverse transcriptase at
42°C for 2 hours in 40 µl of reaction volume.
Amplification by PCR was carried out with a 7.5% aliquot (3
µl) of the reverse-transcribed mRNA, in the presence of 10
mM Tris-HCl, pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.01% gelatin,
0.2 mM nucleotide triphosphates, 1 µM of both sense and
antisense primers and 2.5 units of Taq polymerase (Perkin
Elmer Cetus) in 100 µl reaction volume. Amplifications
were performed on a thermal cycler (Perkin Elmer Cetus)
using the following program: first 5 thermal cycles with
denaturation for 1 minute at 94°C, annealing for 1 minute
at 50°C, a 2 minute ramp to 55°C and elongation for 1 minute
at 72°C, followed by 20 cycles of 1 minute at 94°C, 30
seconds at 55°C and 1 minute at 72°C. A second round of PCR
was performed with 2 µl of the first reaction as a
template. This involved 25 thermal cycles, each composed
of 94°C (1 min), 55°C (0.5 min), 72°C (1 min).
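
The cycling program above can be written down as data, which makes it easy to check the programmed hold time of each round and the size of the template aliquot. The Python sketch below is illustrative only: the step labels and function names are assumptions, and the totals count only the stated hold and ramp times, not the instrument's transition times between temperatures.

```python
# Each block is (number of cycles, [(step label, temperature in °C, seconds)]).
FIRST_ROUND = [
    (5, [("denature", 94, 60), ("anneal", 50, 60),
         ("ramp", 55, 120), ("extend", 72, 60)]),
    (20, [("denature", 94, 60), ("anneal", 55, 30), ("extend", 72, 60)]),
]
SECOND_ROUND = [
    (25, [("denature", 94, 60), ("anneal", 55, 30), ("extend", 72, 60)]),
]


def programmed_seconds(program) -> int:
    """Sum the stated step durations over all cycles of a program."""
    return sum(
        cycles * sum(seconds for _label, _temp, seconds in steps)
        for cycles, steps in program
    )


def aliquot_percent(aliquot_ul: float, total_ul: float) -> float:
    """Express an aliquot as a percentage of the reaction volume."""
    return 100 * aliquot_ul / total_ul


print(programmed_seconds(FIRST_ROUND) / 60)   # minutes of programmed steps, round 1
print(programmed_seconds(SECOND_ROUND) / 60)  # minutes of programmed steps, round 2
print(aliquot_percent(3, 40))                 # 3 µl taken from the 40 µl RT reaction
```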

General procedures such as purification of nucleic
acids, restriction enzyme digestion, gel electrophoresis,
transfer of nucleic acid to solid supports and subcloning
were performed essentially according to established
procedures as described by Sambrook et al (1989),
Molecular Cloning: A Laboratory Manual, 2nd Ed. Cold Spring
Harbor Laboratory (Cold Spring Harbor, New York, USA).

Samples of the PCR products were digested with BamHI
and EcoRI and subsequently fractionated by low melting
point agarose gel electrophoresis. Bands corresponding to
the approximate expected sizes (see Table 1: ≈460 bp for
primer pair B3-S and E8-AS and ≈140 bp for primer pair B7-
S and E8-AS) were excised from the gel and the DNA was
purified. Subsequently, these fragments were ligated into
pUC19 (Yanisch-Perron et al (1985) Gene 33, 103-119), which
had been previously linearised with BamHI and EcoRI, and
transformed into E. coli strain DH5α using standard
protocols (Sambrook et al, supra). Individual clones were
sequenced using standard double-stranded sequencing
techniques and the dideoxynucleotide chain termination
method as described by Sanger et al (1977) Proc. Natl.
Acad. Sci. USA 74, 5463-5467, and T7 DNA polymerase.
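
Directional subcloning of this kind relies on the amplified fragment carrying exactly one BamHI site near one end and one EcoRI site near the other, with neither site occurring internally. A minimal Python sketch of that check follows; the amplicon sequences are hypothetical and the function names are assumptions made for illustration.

```python
# Recognition sequences of the two enzymes (both are palindromic).
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}


def site_positions(sequence: str, site: str) -> list:
    """Return the 0-based start positions of every occurrence of a site."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


def is_directionally_clonable(amplicon: str) -> bool:
    """True if there is one BamHI site upstream of one EcoRI site."""
    bam = site_positions(amplicon, SITES["BamHI"])
    eco = site_positions(amplicon, SITES["EcoRI"])
    return len(bam) == 1 and len(eco) == 1 and bam[0] < eco[0]


# Hypothetical amplicons: GC clamp, BamHI site, insert, EcoRI site, GC clamp.
GOOD = "GC" + "GGATCC" + "ATGCATGCATGC" + "GAATTC" + "GC"
BAD = "GC" + "GGATCC" + "ATGCGAATTCATGC" + "GAATTC" + "GC"  # internal EcoRI site

print(is_directionally_clonable(GOOD))
print(is_directionally_clonable(BAD))
```

An insert with an internal site would be cut into smaller pieces on digestion, so bands of unexpected size on the gel may reflect such sites rather than failed amplification.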

Employing Reverse Transcriptase PCR on HEL mRNA with
the primer pair B3-S and E8-AS, three PCR products were
obtained, termed 11.1, 11.2 and 11.3, that corresponded to
novel genes. Using the primer pair B7-S and E8-AS, an
additional novel PCR product was obtained termed 5.2.

Isolation of cDNA Clones

The PCR products obtained were used to screen various
cDNA libraries described supra. Labelling of the inserts
of PCR products was performed using the random priming
method (Feinberg and Vogelstein (1983) Anal. Biochem. 132,
6-13) using the Megaprime DNA labelling system (Amersham).
The oligonucleotide derived from the sequence of the PCR
product 5.2 was labelled by phosphorylation with T4
polynucleotide kinase following standard protocols
(Sambrook et al, supra). Hybridization and purification of
positive bacteriophages were performed using standard
molecular biological techniques.

The double-stranded DNA clones were all sequenced
using the dideoxynucleotide chain-termination method as
described by Sanger et al, supra, using T7 DNA polymerase
(Pharmacia-LKB) or Sequenase (U.S. Biochemical
Corporation, Cleveland, Ohio, U.S.A.). Compressions of
nucleotides were resolved using 7-deaza-GTP (U.S.
Biochemical Corp.). DNA sequences were analyzed using the
DNA STAR computer program (DNA STAR Ltd. U.K.). Analyses
of the sequences obtained revealed the existence of six
distinct putative receptor serine/threonine kinases which
MV . be n ne.ee ^ tte oll , 0 ((JT) pr i» d human ' 

n« XJry was screened with e radiolabeled 
placenta eDKA library was scr tteir 

5 insert derived from the ™ ^ ee Afferent 

r .,trictlon enrym. *^estion pettern^, thr 
typ es or done, with 8PP rotate in»rt si^ 

, vh t 2 5 jtb were identified. Tne < 

2 W> * w . , -this class and 

„PS7. was chosen as representative of ^ o£ 

10 subjected to complete serein,. J^enc' ^ 

T^»^"^- reliln9 £rMe 

P s . orotan of 503 .mino-acids, with hi,h serenes 
CT laritv to "captor serine/tbreonine Kinases (see 
"! T The first methionine codon, tbe putatrve 
15 below). The « nu= ieotid. J82-285 and is 

'""'/ed* TLlZi stop oodon. Tbis first *T 0 is in 
Ho", f avourTbl' context for translation initiation ( KoraX 
a more favour 8 i 2 5-8148) than the second and 

(198 „ Hucl. Acxd. Res., U. MM M > JJ7 _ 
tbira in-frame XTO at nucleotides J« 3" untrtnll . t . d 

-t: s^t^'tSHL^s: <8o* Co. which 

sequence of 282 nucx rece ptors (Kozak (1991) 

- - — m".^) - ulansl'ated sequence 
" "ises W n^e ides and ends with a poly-, tail Ho 
rf fin poly-A addition signal is found, but there is a 

* , . . _ nMX library with a 

fdTl primed human placenta cDNA 11Drar * 

derived from transcripts f the sam gene. 
revealed a sequence of 2719 nucleotides

- Xon,- open >„ 4 _ 106 a9tee . 

favourably with KoiaJC ^ ^ ^ 

POSitl °: r :' « *« cod'on. in close proxi-ity t urth.r 
There are lour ATG cons en.u. sequence 

dovn.tr.™. which a9re. with cn . cumi „, BO dal the 

(«"•*■ -J*. "ZZXtVZ «anslation .tart sit., 
first MG is pr.dict.d to be th nueleotlde ,. Th. J' 

Tne untranslated ^cZ**- contain. . 

untran.lat.d * "° nucleotide, upstrean 

polyad.nylation .«£ ^^"^ w iae*s «8 
fron the poly-* toU. »• ^ ^ but tte 

, nucleotides fro- the 5 end P nucleotifles and 

seT ,ence emended et the 3 end ^ 

poly - A tail is absent. northern blots, 

poly .d.nyl.tion site. «« : tor belM) . 

h cwever, only on. cloned w initially 

The cDMX tor hun n ** J 

screening an oligo (dl) pr ^ (S£Q „ Ho . „, 

^ 'Tn J. PCK Prooucl 5.2. On. positive cDHX don. 
darived fro. th. PC* prod ^ UtBtlfi-> 

" Cn' partial * .PP.ar.d that this 

i5 However. it Yncod.s only part of th. Kinase 

clone was «=°»P le "' 1 doBai ». The »o«t S< 

— * ^/ThU a MO nucleotide restriction 
sequence of OHli, at , d donain. was 

fr.gn.nt encoding a trun ibroblast com 

» ^ton ^ on? - cl" . -i^b « - - » 

Ub Tj.d 0S« was isolatad (SE Q ID Ho. 5). S.quene. 
ttl tamed 0HF6, cf JMJ nuoleoti4 ., 

analysis of OH« that this clon. was 

without a poly-* tail, gg rea(Jij>9 

" aerlV 1^rrXwS« anlno-acids. Th. first 
ircodon wlich L conpatibl. with KosaX-s consensu. 
sequence (Kozak, supra), is at 310-312 nucleotides and is
preceded by an in-frame stop codon. The 5' and 3'
untranslated sequences are 309 and 1027 nucleotides long,
respectively.
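
Whether an ATG codon lies in a favourable context for translation initiation is commonly judged from the bases at positions -3 and +4 relative to the A of the ATG. The Python sketch below applies a simplified three-level classification (purine at -3 and G at +4). This classification is an assumption used for illustration, not the exact criterion applied in this specification, and the example sequences are hypothetical.

```python
def kozak_context(mrna: str, atg_index: int) -> str:
    """Classify the context of the ATG starting at 0-based `atg_index`."""
    seq = mrna.upper().replace("U", "T")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG codon at the given index")
    # Position -3 is three bases upstream of the A; +4 follows the G.
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    has_purine = minus3 in ("A", "G")
    has_g = plus4 == "G"
    if has_purine and has_g:
        return "strong"
    if has_purine or has_g:
        return "adequate"
    return "weak"


# Hypothetical sequences with the ATG at index 6.
print(kozak_context("GCCACCATGGCG", 6))
print(kozak_context("TTTATTATGTTT", 6))
print(kozak_context("TTTTTTATGTTT", 6))
```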

ALK-4 cDNA was identified by screening an oligo
(dT) primed human erythroleukemia cDNA library with the
radiolabelled insert of the PCR product 11.1 as a probe.
One cDNA clone, termed 11H8, was identified with an insert
size of 2 kb (SEQ ID No. 7). An open reading frame was
found encoding a protein sequence of 383 amino-acids,
including a truncated extracellular domain, with high
similarity to receptor serine/threonine kinases. The 3'
untranslated sequence is 818 nucleotides and does not
contain a poly-A tail, suggesting that the cDNA was
internally primed. cDNA encoding the complete
extracellular domain (nucleotides 1-366) was obtained from
HEL cells by RT-PCR with 5' primer (SEQ ID No. 24) derived
in part from sequence at translation start site of SKR-2 (a
cDNA sequence deposited in GenBank data base, accession
number L10125, that is identical in part to ALK-4) and 3'
primer (SEQ ID No. 25) derived from 11H8 cDNA clone.

ALK-5 was identified by screening the random primed
HEL cell λgt10 cDNA library with the PCR product 11.1 as
a probe. This yielded one positive clone termed EMBLA
(insert size of 5.3 kb with 2 internal EcoRI sites).
Nucleotide sequencing revealed an open reading frame of
1509 bp, coding for 503 amino-acids. The open reading
frame was flanked by a 5' untranslated sequence of 76 bp,
and a 3' untranslated sequence of 3.7 kb which was not
completely sequenced. The nucleotide and deduced amino-
acid sequences of ALK-5 are shown in SEQ ID Nos. 9 and 10.
In the 5' part of the open reading frame, only one ATG
codon was found; this codon fulfils the rules of
translation initiation (Kozak, supra). An in-frame stop
codon was found at nucleotides (-54)-(-52) in the 5'
untranslated region. The predicted ATG start codon is
followed by a stretch of hydrophobic amino-acid residues
which has characteristics of a cleavable signal sequence.
Therefore, the first ATG codon is likely to be used as a
translation initiation site. A preferred cleavage site for
the signal peptidase, according to von Heijne (1986) Nucl.
Acids Res. 14, 4683-4690, is located between amino-acid
residues 24 and 25. The calculated molecular mass of the
primary translated product of the ALK-5 without signal
sequence is 53,646 Da.
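
The relation between open reading frame length and protein length used above (1509 bp coding for 503 amino-acids, which holds when the stop codon is not counted) can be checked with a simple scan for the first ATG and the next in-frame stop codon. The Python sketch below is illustrative; the test sequence is a toy example and not part of SEQ ID No. 9.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(sequence: str):
    """Return (start, stop) indices of the first ATG-initiated reading frame.

    `start` is the index of the A of the ATG and `stop` is the index of the
    first base of the in-frame stop codon; returns None if either is missing.
    """
    seq = sequence.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for pos in range(start, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOP_CODONS:
            return start, pos
    return None


def residue_count(sequence: str) -> int:
    """Number of amino-acids encoded between the ATG and the stop codon."""
    orf = first_orf(sequence)
    if orf is None:
        return 0
    start, stop = orf
    return (stop - start) // 3


# Toy cDNA: 5' flank, ATG AAA TTT, stop codon TAG, 3' flank.
TOY_CDNA = "CCATGAAATTTTAGGG"
print(first_orf(TOY_CDNA))
print(residue_count(TOY_CDNA))
print(1509 // 3)  # codons in a 1509 bp reading frame that excludes the stop
```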

Screening of the mouse embryo λEXlox cDNA library
using PCR product 11.1 as a probe yielded 20 positive
clones. DNAs from the positive clones obtained from this
library were digested with EcoRI and HindIII,
electrophoretically separated on a 1.3% agarose gel and
transferred to nitrocellulose filters according to
established procedures as described by Sambrook et al,
supra. The filters were then hybridized with specific
probes for human ALK-1 (nucleotide 288-670), ALK-2
(nucleotide 1-581), ALK-3 (nucleotide 79-824) or ALK-4
(nucleotide 1178-1967). Such analyses revealed that a clone
termed ME-7 hybridised with the human ALK-3 probe.
However, nucleotide sequencing revealed that this clone was
incomplete, and lacked the 5' part of the translated
region. Screening the same cDNA library with a probe
corresponding to the extracellular domain of human ALK-3
(nucleotides 79-824) revealed the clone ME-D. This clone
was isolated and the sequence was analyzed. Although this
clone was incomplete in the 3' end of the translated
region, ME-7 and ME-D overlapped and together covered the
complete sequence of mouse ALK-3. The predicted amino-acid
sequence of mouse ALK-3 is very similar to the human
sequence; only 8 amino-acid residues differ (98% identity;
see SEQ ID No. 14) and the calculated molecular mass of the
primary translated product without the putative signal
sequence is 57,447 Da.
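
A calculated molecular mass of this kind is obtained by summing the residue masses of the amino-acids and adding one water molecule for the free termini. The Python sketch below uses approximate average residue masses. The function name and the short test peptides are illustrative assumptions, and the sketch does not reproduce the specific figures quoted above, which depend on the full sequences.

```python
# Approximate average residue masses in daltons (amino-acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_mass(protein: str) -> float:
    """Average molecular mass (Da) of an unmodified polypeptide chain."""
    if not protein:
        return 0.0
    return sum(RESIDUE_MASS[r] for r in protein.upper()) + WATER_MASS


def mass_without_signal(protein: str, signal_length: int) -> float:
    """Mass of the chain after removing an N-terminal signal sequence."""
    return molecular_mass(protein[signal_length:])


print(round(molecular_mass("GG"), 2))  # glycylglycine, for a sanity check
```

Post-translational modifications such as N-linked glycosylation are not accounted for, so the apparent size of the mature receptor on a gel may be larger than the value calculated from the sequence.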

Of the clones obtained from the initial library
screening with PCR product 11.1, four clones hybridized to
the probe corresponding to the conserved kinase domain of
ALK-4 but not to probes from more divergent parts of ALK-1
to -4. Analysis of these clones revealed that they have an
identical sequence which differs from those of ALK-1 to -5
and was termed ALK-6. The longest clone ME-6 with a 2.0 kb
insert was completely sequenced yielding a 1952 bp fragment
consisting of an open reading frame of 1506 bp (502 amino-
acids), flanked by a 5' untranslated sequence of 186 bp,
and a 3' untranslated sequence of 160 bp. The nucleotide
and predicted amino-acid sequences of mouse ALK-6 are shown
in SEQ ID Nos. 17 and 18. No polyadenylation signal was
found in the 3' untranslated region of ME-6, indicating that
the cDNA was internally primed in the 3' end. Only one ATG
codon was found in the 5' part of the open reading frame,
which fulfils the rules for translation initiation (Kozak,
supra), and was preceded by an in-frame stop codon at
nucleotides 163-165. However, a typical hydrophobic leader
sequence was not observed at the N terminus of the
translated region. Since there is no ATG codon and
putative hydrophobic leader sequence, this ATG codon is
likely to be used as a translation initiation site. The
calculated molecular mass of the primary translated product
with the putative signal sequence is 55,576 Da.

Mouse ALK-1 (clone AM6 with 1.9 kb insert) was
obtained from the mouse placenta λZAPII cDNA library using
human ALK-1 cDNA as a probe (see SEQ ID No. 11). Mouse
ALK-4 (clone 8a1 with 2.3 kb insert) was also obtained from
this library using human ALK-4 cDNA as a probe (SEQ
ID No. 15).

To summarise, clones HP22, HP57, ONF1, ONF3, ONF4 and
HP29 encode the same gene, ALK-1. Clone AM6 encodes mouse
ALK-1. HP53, HP64 and HP84 encode the same gene, ALK-2.
ONF5, ONF2 and ON11 encode the same gene ALK-3. ME-7 and
ME-D encode the mouse counterpart of human ALK-3. 11H8
encodes a different gene ALK-4, whilst 8a1 encodes the
mouse equivalent. EMBLA encodes ALK-5, and ME-6 encodes
ALK-6.

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The sequ nee alignment between the 6 ALK genes and 
TBR-II, mActR-II and ActR-IIB is sh vn in Figure 3. These 
molecules have a similar domain structure; an N-tenainal 
predicted hydrophobic signal sequence (von Heijne (1986) 
5 Nucl. Acids Res. 1£: 4683-4690) is followed by a relatively 
small extracellular cysteine-rich ligand binding domain, a 
single hydrophobic transmembrane region (Kyte £ Doolittle 
(1982) J. Mol. Biol, 151, 105-132) and a C-terminal 
intracellular portion, which consists almost entirely of a 

10 kinase domain (Figures 3 and 4). 
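
A hydrophobic transmembrane region of the kind referred to above is commonly located with a sliding-window hydropathy average on the Kyte and Doolittle scale. The following is a minimal sketch, not the procedure actually used for the ALKs: the 19-residue window and the 1.6 threshold are the values usually quoted for that method and are assumptions here, and the test sequence is invented for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every window of the given length along seq."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """0-based start positions of windows whose mean reaches the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i for i, score in enumerate(profile) if score >= threshold]


# Hypothetical toy sequence: charged flanks around a 19-residue apolar core.
toy = "DEKRDEKRDE" + "LLVLALLAVLGLLVLALLA" + "RRKKDEDEKR"
print(hydrophobic_windows(toy))
```

Window start positions near index 10 are reported for the toy sequence, which is where its apolar core begins.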

The extracellular domains of these receptors have 
cysteine-rich regions, but they show little sequence 
similarity; for example, less than 20% sequence identity is 
found between Daf-1, ActR-II, TBR-II and ALK-5. The ALKs 
appear to form a subfamily as they show higher sequence 
similarities (15-47% identity) in their extracellular 
domains. The extracellular domains of ALK-5 and ALK-4 have 
about 29% sequence identity. In addition, ALK-3 and ALK-6 
share a high degree of sequence similarity in their 
extracellular domains (46% identity). 
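
Percentage identity figures such as those above are obtained from a pairwise alignment by counting identical aligned positions. The sketch below shows one common convention, in which positions with a gap in either sequence are excluded from the denominator; the specification does not state which convention was applied, so this is an assumption, and the example strings are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned, gap-free columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments (not taken from the ALK sequences).
print(percent_identity("ACDE-G", "ACDFHG"))
```

Other conventions, for example dividing by the length of the shorter sequence, give somewhat different values for the same alignment.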

The positions of many of the cysteine residues in all 
receptors can be aligned, suggesting that the extracellular 
domains may adopt a similar structural configuration. See 
Figure 5 for ALKs -1, -2, -3 & -5. Each of the ALKs (except 
ALK-6) has a potential N-linked glycosylation site, the 
position of which is conserved between ALK-1 and ALK-2, and 
between ALK-3, ALK-4 and ALK-5 (see Figure 4). 
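
Potential N-linked glycosylation sites are conventionally predicted by scanning for the sequon Asn-X-Ser/Thr, where X is any residue other than proline. A minimal sketch of such a scan follows; the function name and the example sequence are hypothetical, and the presence of a sequon indicates only a potential site.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons be reported individually.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(seq):
    """1-based positions of Asn residues in N-X-S/T sequons (X not Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical test peptide: sequons at N3 and N11, but not at N7 (N-P-T).
print(n_glycosylation_sites("MANVSQNPTANGT"))
```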

The sequence similarities in the kinase domains 
between Daf-1, ActR-II, TBR-II and ALK-5 are approximately 
40%, whereas the sequence similarity between the ALKs 1 to 
6 is higher (between 59% and 90%; see Figure 6). Pairwise 
comparison using the Jotun-Hein sequence alignment program 
(Hein (1990) Meth. Enzymol. 183, 626-645), between all 
family members, identifies the ALKs as a separate subclass 
among serine/threonine kinases (Figure 7). 

The catalytic domains of kinases can be divided into 
12 subdomains with stretches of conserved amino-acid 
residues. The key motifs are found in serine/threonine 
kinase receptors, suggesting that they are functional 
kinases. The consensus sequence for the binding of ATP 
(Gly-X-Gly-X-X-Gly in subdomain I followed by a Lys residue 
further downstream in subdomain II) is found in all the 
ALKs. 
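
The ATP-binding consensus described above can be searched for as a glycine loop followed by a lysine a short distance downstream. The sketch below is an illustration under stated assumptions: the 30-residue search distance is an arbitrary, hypothetical cut-off, and the test fragment is invented rather than copied from any ALK sequence.

```python
import re

# Gly-X-Gly-X-X-Gly, the glycine loop of subdomain I.
GLY_LOOP = re.compile(r"G.G..G")


def atp_site_candidates(seq, max_gap=30):
    """Return (loop_start, lysine_position) pairs, both 1-based.

    For every glycine loop, the first Lys within max_gap residues
    downstream of the loop is reported. max_gap is an assumed cut-off.
    """
    seq = seq.upper()
    hits = []
    for m in GLY_LOOP.finditer(seq):
        downstream = seq[m.end():m.end() + max_gap]
        offset = downstream.find("K")
        if offset != -1:
            hits.append((m.start() + 1, m.end() + offset + 1))
    return hits


# Hypothetical fragment with a glycine loop at 3-8 and a lysine at 20.
print(atp_site_candidates("AAGKGRFGEVWRGAAAVAVKIF"))
```

A pattern match of this kind only flags candidates; assignment to subdomains I and II still depends on the alignment with known kinases.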

The kinase domains of Daf-1, ActR-II, and ALKs show 
approximately equal sequence similarity with tyrosine and 
serine/threonine protein kinases. However, analysis of the 
amino-acid sequences in subdomains VI and VIII, which are 
the most useful to distinguish a specificity for 
phosphorylation of tyrosine residues versus 
serine/threonine residues (Hanks et al (1988) Science 241, 
42-52) indicates that these kinases are serine/threonine 
kinases; refer to Table 2. 

TABLE 2 

KINASE                               SUBDOMAINS 
                                     VIB       VIII 

Serine/threonine kinase consensus    DLKPEN    G(T/S)XX(Y/F)X 
Tyrosine kinase consensus            DLAARN    XP(I/V)(K/R)W(T/M) 
ActR-II                              DIKSKN    GTRRYM 
ActR-IIB                             DFKSKN    GTRRYM 
TBR-II                               DLKSSN    GTARYM 
ALK-1                                DFKSRN    GTKRYM 
ALK -2, -3, -4, -5, & -6             DLKSKN    GTKRYM 
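
The comparison in Table 2 can be expressed as pattern matching of the subdomain VIII motif against each consensus. The sketch below assumes that the serine/threonine consensus reads G(T/S)XX(Y/F)X and the tyrosine kinase consensus reads XP(I/V)(K/R)W(T/M); these readings are assumptions for the purpose of illustration, and the tyrosine kinase test motif is hypothetical.

```python
import re

# Assumed subdomain VIII consensus patterns, with X written as ".".
CONSENSUS_VIII = {
    "serine/threonine": re.compile(r"G[TS]..[YF]."),
    "tyrosine": re.compile(r".P[IV][KR]W[TM]"),
}


def classify_subdomain_viii(motif):
    """Names of the consensus patterns that the six-residue motif fits."""
    motif = motif.upper()
    return [
        name
        for name, pattern in CONSENSUS_VIII.items()
        if pattern.fullmatch(motif)
    ]


# Receptor motifs as listed in Table 2.
for motif in ("GTRRYM", "GTARYM", "GTKRYM"):
    print(motif, classify_subdomain_viii(motif))
```

All three receptor motifs fit only the serine/threonine pattern under these assumptions, in line with the conclusion drawn from the table.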



The sequence motifs DLKSKN (Subdomain VIB) and GTKRYM 
(Subdomain VIII), that are found in most of the 
serine/threonine kinase receptors, agree well with the 
consensus sequences for all protein serine/threonine kinase 
receptors in these regions. In addition, these receptors, 
except for ALK-1, do not have a tyrosine residue surrounded 
by acidic residues between subdomains VII and VIII, which 
is common for tyrosine kinases. A unique characteristic of 
the members of the ALK serine/threonine kinase receptor 
family is the presence of two short inserts in the kinase 
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vn . n * viB and betw en 

re,i r; tTi " . £ «-* ^-' ent betweeB faally 

terminal tail, are on ^ sequenee 

s members (see rigures 3 ana 

for TGF-B and 

similarity «ith ^^J^^TLains of »I» -1 to 
^era^e^Ver^r, Ser-S, 7 . cln-soo, Gln-498 
and Ser-497. respectively. 

The distribution of [...] was determined by Northern blot 
[...] from Clontech (Palo Alto, CA). The filters were 
hybridised [...] overnight in 50% formaldehyde, 5 x standard 
saline citrate (SSC; 1 x SSC is [...]), [...] SDS, [...] 
sodium phosphate, 5 x Denhardt's solution and 0.1 mg/ml 
salmon sperm DNA. In order to minimise cross-hybridisation, 
[...] that did not encode part of [...] the highly diverged 
sequences of either [...] untranslated and ligand-binding 
regions (probes for ALK-1, [...]) or [...] untranslated 
sequences (probe for [...]). [...] labelled by random 
priming using [...] DNA labelling [...] (Feinberg & 
Vogelstein [...] Anal. Biochem. [...]). Unincorporated 
label was removed by Sephadex [...]. The filters were 
washed at 65°C, twice for [...] and twice for [...] SDS 
before being exposed to X-ray film. Stripping of blots was 
performed by incubation at 90°C in water for 20 minutes. 

The ALK-5 mRNA size and distribution were determined 
by Northern blot [...]. A [...] fragment of 980 bp of the 
full-length ALK-5 cDNA clone, corresponding to the 
C-terminal part of [...] 
untranslated region (nucleotides 1259-2232 in SEQ ID No. 9) 
was used as a probe. The filter was washed twice in 0.5 x 
SSC, 0.1% SDS at 55°C for 15 minutes. 

Using the probe for ALK-1, two transcripts of 2.2 and 
4.9 kb were detected. The ALK-1 expression level varied 
strongly between different tissues: high in placenta and 
lung, moderate in heart, muscle and kidney, and low (to not 
detectable) in brain, liver and pancreas. The relative 
ratios between the two transcripts were similar in most 
tissues; in kidney, however, there was relatively more of 
the 4.9 kb transcript. By reprobing the blot with a probe 
for ALK-2, one transcript of 4.0 kb was detected with a 
ubiquitous expression pattern. Expression was detected in 
every tissue investigated and was highest in placenta and 
skeletal muscle. Subsequently the blot was reprobed for 
ALK-3. One major transcript of 4.4 kb and a minor 
transcript of 7.9 kb were detected. Expression was high in 
skeletal muscle, in which an additional minor transcript 
of 10 kb was also observed. Moderate levels of ALK-3 
mRNA were detected in heart, placenta, kidney and pancreas, 
and low (to not detectable) expression was found in brain, 
lung and liver. The relative ratios between the different 
transcripts were similar in the tested tissues, the 4.4 kb 
transcript being the predominant one, with the exception 
of brain, where both transcripts were expressed at a 
similar level. Probing the blot with ALK-4 indicated the 
presence of a transcript with the estimated size of 5.2 kb 
and revealed a ubiquitous expression pattern. The results 
of Northern blot analysis using the probe for ALK-5 showed 
that a 5.5 kb transcript is expressed in all human tissues 
tested, being most abundant in placenta and least abundant 
in brain and heart. 

The distribution of mRNA for mouse ALK-3 and -6 in 
various mouse tissues was also determined by Northern blot 
analysis. A multiple mouse tissue blot was obtained from 
Clontech, Palo Alto, California, U.S.A. The filter was 
hybridized as described above with probes for mouse ALK-3 
[...] The [...] restriction fragment, [...] fragment, 
corresponding to nucleotides [...] ALK-6 were used as 
probes. The filter was washed at 65°C twice for 30 minutes 
in 2.5 x SSC, 0.1% SDS and twice for 30 minutes with 
0.3 x SSC, 0.1% SDS and then subjected to autoradiography. 

Using the probe for [...] transcript was found only in 
spleen. By reprobing [...] with the [...] signal was seen 
in the other tissues tested, i.e. heart, [...] skeletal 
muscle, kidney and testis. 

[...] detected transcript sizes were different, and thus 
[...] cross-reaction between mRNAs for the different ALKs 
was [...] transcripts is unknown at present; they may be 
formed by alternative mRNA splicing, differential [...], 
[...] different promoters, or by a [...] events. 
Differences in splicing [...] regions coding for the 
extracellular domains may lead to the synthesis of 
receptors with different [...] isolation of [...] acid 
sequences coding for [...] family of human [...] kinases. 
The cDNA for ALK-5 was then used to [...] encoded protein 
size and binding properties. 

[...] 

[...] properties of the proteins encoded by the different 
ALK cDNAs, the cDNA for each ALK was subcloned into a 
eukaryotic expression vector and transfected into [...] 
precipitation using a rabbit antiserum raised against 
a synthetic peptide corresponding to part of the 
intracellular juxtamembrane region. This region is 
divergent in sequence between the various serine/threonine 
kinase receptors. The following amino-acid residues were 
used: 

ALK-1 145-166 
ALK-2 151-172 
ALK-3 181-202 
ALK-4 153-171 
ALK-5 158-179 
ALK-6 151-168 

The rabbit antiserum against ALK-5 was designated VPN. 
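
The lengths of the synthetic peptides follow directly from the residue ranges listed above. The short sketch below merely tabulates them; the variable and function names are hypothetical.

```python
# Residue ranges of the immunising peptides, as listed in the text.
PEPTIDE_RANGES = {
    "ALK-1": (145, 166),
    "ALK-2": (151, 172),
    "ALK-3": (181, 202),
    "ALK-4": (153, 171),
    "ALK-5": (158, 179),
    "ALK-6": (151, 168),
}


def peptide_length(first, last):
    """Number of residues in an inclusive range first..last."""
    return last - first + 1


lengths = {
    name: peptide_length(*bounds) for name, bounds in PEPTIDE_RANGES.items()
}
for name, length in lengths.items():
    print(f"{name}: {length} residues")
```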
The peptides were synthesized with an Applied 
Biosystems 430A Peptide Synthesizer using t-butoxycarbonyl 
chemistry and purified by reversed-phase high performance 
liquid chromatography. The peptides were coupled to 
keyhole limpet haemocyanin (Calbiochem-Behring) using 
glutaraldehyde, as described by Gullick et al (1985) EMBO 
J. 4, 2869-2877. The coupled peptides were mixed with 
Freund's adjuvant and used to immunize rabbits. 

Transient transfection of the ALK-5 cDNA 

COS-1 cells (American Type Culture Collection) and the 
R mutant of Mv1Lu cells (for references, see below) were 
cultured in Dulbecco's modified Eagle's medium containing 
10% fetal bovine serum (FBS), 100 units/ml penicillin 
and 50 µg/ml streptomycin in a 5% CO2 atmosphere at 37°C. 
The ALK-5 cDNA (nucleotides (-76) - 2232), which includes 
the complete coding region, was cloned in the pSV7d vector 
(Truett et al (1985) DNA 4, 333-349), and used for 
transfection. Transfection into COS-1 cells was performed 
by the calcium phosphate precipitation method (Wigler et al 
(1979) Cell 16, 777-785). Briefly, cells were seeded into 
6-well cell culture plates at a density of 5 x 10^5 
cells/well, and transfected the following day with 10 µg of 
recombinant plasmid. After overnight incubation, cells 
were washed three times with a buffer containing 25 mM 
Tris-HCl, pH 7.4, 138 mM NaCl, 5 mM KCl, 0.7 mM CaCl2, 0.5 
mM MgCl2 and 0.6 mM Na2HPO4, and then incubated with 
Dulbecco's modified Eagle's medium containing FBS and 
antibiotics. Two days after transfection, the cells were 
metabolically labelled by incubating the cells for 6 hours 
in methionine and cysteine-free MCDB 104 medium with 150 
µCi/ml of [35S]-methionine and [35S]-cysteine (in vivo 
labelling mix; Amersham). After labelling, the cells were 
washed with 150 mM NaCl, 25 mM Tris-HCl, pH 7.4, and then 
solubilized with a buffer containing 20 mM Tris-HCl, pH 7.4, 
150 mM NaCl, 10 mM EDTA, 1% Triton X-100, 1% deoxycholate, 
1.5% Trasylol (Bayer) and 1 mM phenylmethylsulfonylfluoride 
(PMSF; Sigma). After 15 minutes on ice, the cell lysates 
were pelleted by centrifugation, and the supernatants were 
then incubated with 7 µl of preimmune serum for 1.5 hours 
at 4°C. Samples were then given 50 µl of protein A- 
Sepharose (Pharmacia-LKB) slurry (50% packed beads in 150 
mM NaCl, 20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and 
incubated for 45 minutes at 4°C. The beads were spun down 
by centrifugation, and the supernatants (1 ml) were then 
incubated with either 7 µl of preimmune serum or the VPN 
antiserum for 1.5 hours at 4°C. For blocking, 10 µg of 
peptide was added together with the antiserum. Immune 
complexes were then given 50 µl of protein A-Sepharose 
(Pharmacia-LKB) slurry (50% packed beads in 150 mM NaCl, 
20 mM Tris-HCl, pH 7.4, 0.2% Triton X-100) and incubated for 
45 minutes at 4°C. The beads were spun down and washed 
four times with a washing buffer (20 mM Tris-HCl, pH 7.4, 
500 mM NaCl, 1% Triton X-100, 1% deoxycholate and 0.2% 
SDS), followed by one wash in distilled water. The immune 
complexes were eluted by boiling for 5 minutes in the SDS- 
sample buffer (100 mM Tris-HCl, pH 8.8, 0.01% bromophenol 
blue, 36% glycerol, 4% SDS) in the presence of 10 mM DTT, 
and analyzed by SDS-gel electrophoresis using 7-15% 
polyacrylamide gels (Blobel and Dobberstein (1975) J. Cell 
Biol. 67, 835-851). Gels were fixed, incubated with 
Amplify (Amersham) for 20 minutes, and subjected to 
fluorography. A component of 53 kDa was seen. This 
component was not seen when preimmune serum was used, or 
when 10 µg blocking peptide was added together with the 
antiserum. Moreover, it was not detectable in samples 
derived from untransfected COS-1 cells using either 
preimmune serum or the antiserum. 

Digestion with Endoglycosidase F 

Samples immunoprecipitated with the VPN antisera 
obtained as described above were incubated with 0.5 U of 
endoglycosidase F (Boehringer Mannheim Biochemica) in a 
buffer containing 100 mM sodium phosphate, pH 6.1, 50 mM 
EDTA, 1% Triton X-100, 0.1% SDS and 1% B-mercaptoethanol at 
37°C for 24 hours. Samples were eluted by boiling for 5 
minutes in the SDS-sample buffer, and analyzed by SDS- 
polyacrylamide gel electrophoresis as described above. 
Hydrolysis of N-linked carbohydrates by endoglycosidase F 
shifted the 53 kDa band to 51 kDa. The extracellular domain 
of ALK-5 contains one potential acceptor site for N- 
glycosylation and the size of the deglycosylated protein is 
close to the predicted size of the core protein. 

Establishment of PAE Cell Lines Expressing ALK-5 

In order to investigate whether the ALK-5 cDNA encodes 
a receptor for TGF-B, porcine aortic endothelial (PAE) 
cells were transfected with an expression vector containing 
the ALK-5 cDNA, and analyzed for the binding of 125I-TGF-B1. 

PAE cells were cultured in Ham's F-12 medium 
supplemented with 10% FBS and antibiotics (Miyazono et al 
(1988) J. Biol. Chem. 263, 6407-6415). The ALK-5 cDNA was 
cloned into the cytomegalovirus (CMV)-based expression 
vector pcDNA I/NEO (Invitrogen), and transfected into PAE 
cells by electroporation. After 48 hours, selection was 
initiated by adding Geneticin (G418 sulphate; Gibco-BRL) 
to the culture medium at a final concentration of 0.5 mg/ml 
(Westermark et al (1990) Proc. Natl. Acad. Sci. USA 87, 
128-132). Several clones were obtained, and after analysis 
by immunoprecipitation using the VPN antiserum, one clone 
denoted PAE/TBR-I was chosen and further analyzed. 

Iodination of TGF-B1, Binding and Affinity Crosslinking 

Recombinant human TGF-B1 was iodinated using the 
chloramine T method according to Frolik et al (1984) J. 
Biol. Chem. 259, 10995-11000. Cross-linking experiments 
were performed as previously described (Ichijo et al 
(1990) Exp. Cell Res. 187, 263-269). Briefly, cells in 6- 
well plates were washed with binding buffer (phosphate- 
buffered saline containing 0.9 mM CaCl2, 0.49 mM MgCl2 and 
1 mg/ml bovine serum albumin (BSA)), and incubated on ice 
in the same buffer with 125I-TGF-B1 in the presence or 
absence of excess unlabelled TGF-B1 for 3 hours. Cells 
were washed and cross-linking was done in the binding 
buffer without BSA together with 0.28 mM disuccinimidyl 
suberate (DSS; Pierce Chemical Co.) for 15 minutes on ice. 
The cells were harvested by the addition of 1 ml of 
detachment buffer (10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 10% 
glycerol, 0.3 mM PMSF). The cells were pelleted by 
centrifugation, then resuspended in 50 µl of solubilization 
buffer (125 mM NaCl, 10 mM Tris-HCl, pH 7.4, 1 mM EDTA, 1% 
Triton X-100, 0.3 mM PMSF, 1% Trasylol) and incubated for 
40 minutes on ice. Cells were centrifuged again and 
supernatants were subjected to analysis by SDS-gel 
electrophoresis using 4-15% polyacrylamide gels, followed 
by autoradiography. 125I-TGF-B1 formed a 70 kDa cross- 
linked complex in the transfected PAE cells (PAE/TBR-I 
cells). The size of this complex was very similar to that 
of the TGF-B type I receptor complex observed at lower 
amounts in the untransfected cells. A concomitant increase 
of 94 kDa TGF-B type II receptor complex could also be 
observed in the PAE/TBR-I cells. Components of 150-190 
kDa, which may represent crosslinked complexes between the 
type I and type II receptors, were also observed in the 
PAE/TBR-I cells. 
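
The suggestion that the 150-190 kDa components contain both receptor types can be checked against the sizes of the individual complexes. The sketch below simply adds the two apparent sizes; since each cross-linked complex already includes bound ligand, the sum is only an approximation, and the names used are hypothetical.

```python
# Apparent sizes of the cross-linked complexes reported in the text (kDa).
TYPE_I_COMPLEX_KDA = 70
TYPE_II_COMPLEX_KDA = 94
OBSERVED_RANGE_KDA = (150, 190)


def within(value, bounds):
    """True if value lies inside the inclusive (low, high) range."""
    low, high = bounds
    return low <= value <= high


combined = TYPE_I_COMPLEX_KDA + TYPE_II_COMPLEX_KDA
print(combined, within(combined, OBSERVED_RANGE_KDA))
```

On this rough estimate a combined type I and type II complex would migrate at about 164 kDa, inside the observed range.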

In order to determine whether the cross-linked 70 kDa 
complex contained the protein encoded by the ALK-5 cDNA, 
the affinity cross-linking was followed by 
immunoprecipitation using the VPN antiserum. For this, 



WO 94/11502                                   PCT/GB93/02367 

cells in 25 cm2 flasks were used. The supernatants 
obtained after cross-linking were incubated with 7 µl of 
preimmune serum or VPN antiserum in the presence or absence 
of 10 µg of peptide for 1.5 h at 4°C. Immune complexes were 
then added to 50 µl of protein A-Sepharose slurry and 
incubated for 45 minutes at 4°C. The protein A-Sepharose 
beads were washed four times with the washing buffer, once 
with distilled water, and the samples were analyzed by SDS- 
gel electrophoresis using 4-15% polyacrylamide gradient 
gels and autoradiography. A 70 kDa cross-linked complex 
was precipitated by the VPN antiserum in PAE/TBR-I cells, 
and a weaker band of the same size was also seen in the 
untransfected cells, indicating that the untransfected PAE 
cells contained a low amount of endogenous ALK-5. The 70 
kDa complex was not observed when preimmune serum was used, 
or when immune serum was blocked by 10 µg of peptide. 
Moreover, a coprecipitated 94 kDa component could also be 
observed in the PAE/TBR-I cells. The latter component is 
likely to represent a TGF-B type II receptor complex, since 
an antiserum, termed DRL, which was raised against a 
synthetic peptide from the C-terminal part of the TGF-B 
type II receptor, precipitated a 94 kDa TGF-B type II 
receptor complex, as well as a 70 kDa type I receptor 
complex from PAE/TBR-I cells. 

The carbohydrate contents of ALK-5 and the TGF-B type 
II receptor were characterized by deglycosylation using 
endoglycosidase F as described above and analyzed by SDS- 
polyacrylamide gel electrophoresis and autoradiography. 
The ALK-5 cross-linked complex shifted from 70 kDa to 66 
kDa, whereas that of the type II receptor shifted from 94 
kDa to 82 kDa. The observed larger shift of the type II 
receptor band compared with that of the ALK-5 band is 
consistent with the deglycosylation data of the type I and 
type II receptors on rat liver cells reported previously 
(Cheifetz et al (1988) J. Biol. Chem. 263, 16984-16991), 
and fits well with the fact that the porcine TGF-B type II 
receptor has two N-glycosylation sites (Lin et al (1992) 
Cell 68, 775-785), whereas ALK-5 has only one (see SEQ ID 
No. 9). 
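
The deglycosylation shifts discussed above can be tabulated side by side. The sketch below takes the deglycosylated ALK-5 cross-linked complex as 66 kDa and uses band sizes as read from the gels, so the values are approximate; the dictionary keys are descriptive labels chosen for illustration.

```python
# Band sizes before and after endoglycosidase F treatment (kDa).
BANDS_KDA = {
    "ALK-5, metabolically labelled": (53, 51),
    "ALK-5 cross-linked complex": (70, 66),
    "type II receptor cross-linked complex": (94, 82),
}


def shift_kda(before, after):
    """Apparent mass lost on removal of N-linked carbohydrate."""
    return before - after


shifts = {name: shift_kda(*sizes) for name, sizes in BANDS_KDA.items()}
for name, shift in shifts.items():
    print(f"{name}: {shift} kDa")
```

The type II receptor complex loses about three times as much apparent mass as the ALK-5 complex, which is the larger shift referred to in the text.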

Binding of TGF-B1 to the type I receptor is known to 
be abolished by transient treatment of the cells with 
dithiothreitol (DTT) (Cheifetz and Massagué (1991) J. Biol. 
Chem. 266, 20767-20772; Wrana et al (1992) Cell 71, 1003- 
1014). When analyzed by affinity cross-linking, binding of 
125I-TGF-B1 to ALK-5, but not to the type II receptor, was 
completely abolished by DTT treatment of PAE/TBR-I cells. 
Affinity cross-linking followed by immunoprecipitation by 
the VPN antiserum showed that neither the ALK-5 nor the 
type II receptor complexes was precipitated after DTT 
treatment, indicating that the VPN antiserum reacts only 
with ALK-5. The data show that the VPN antiserum 
recognizes a TGF-B type I receptor, and that the type I and 
type II receptors form a heteromeric complex. 




COS Cells 

Transient expression plasmids of ALKs -1 to -6 and 
TBR-II were generated by subcloning into the pSV7d 
expression vector or into the pcDNA I expression vector 
(Invitrogen). Transient transfection of COS-1 cells and 
iodination of TGF-B1 were carried out as described above. 
Crosslinking and immunoprecipitation were performed as 
described for PAE cells above. 

Transfection of cDNAs for ALKs into COS-1 cells did 
not show any appreciable binding of 125I-TGF-B1, consistent 
with the observation that type I receptors do not bind TGF- 
B in the absence of type II receptors. When the TBR-II 
cDNA was co-transfected with cDNAs for the different ALKs, 
type I receptor-like complexes were seen, at different 
levels, in each case. COS-1 cells transfected with TBR-II 
and ALK cDNAs were analyzed by affinity crosslinking 
followed by immunoprecipitation using the DRL antisera or 
specific antisera against ALKs. Each one of the ALKs bound 
125I-TGF-B1 and was coimmunoprecipitated with the TBR-II 
complex using the DRL antiserum. Comparison of the 
efficiency of the different ALKs to form heteromeric 
complexes with TBR-II revealed that ALK-5 formed such 
complexes more efficiently than the other ALKs. The size 
of the crosslinked complex was larger for ALK-3 than for 
other ALKs, consistent with its slightly larger size. 

Expression of the ALK Proteins in Different Cell Lines 

Two different approaches were used to elucidate which 
ALKs are physiological type I receptors for TGF-B. 

Firstly, several cell lines were tested for the 
expression of the ALK proteins by cross-linking followed by 
immunoprecipitation using the specific antisera against 
ALKs and the TGF-B type II receptor. The mink lung 
epithelial cell line, Mv1Lu, is widely used to provide 
target cells for TGF-B action and is well characterized 
regarding TGF-B receptors (Laiho et al (1990) J. Biol. 
Chem. 265, 18518-18524; Laiho et al (1991) J. Biol. Chem. 
266, 9108-9112). Only the VPN antiserum efficiently 
precipitated both type I and type II TGF-B receptors in the 
wild type Mv1Lu cells. The DRL antiserum also precipitated 
components with the same size as those precipitated by the 
VPN antiserum. A mutant cell line (R mutant) which lacks 
the TGF-B type I receptor and does not respond to TGF-B 
(Laiho et al, supra) was also investigated by cross-linking 
followed by immunoprecipitation. Consistent with the 
results obtained by Laiho et al (1990), supra, the type III 
and type II TGF-B receptor complexes, but not the type I 
receptor complex, were observed by affinity crosslinking. 
Crosslinking followed by immunoprecipitation using the DRL 
antiserum revealed only the type II receptor complex, 
whereas neither the type I nor type II receptor complexes 
was seen using the VPN antiserum. When the cells were 
metabolically labelled and subjected to immunoprecipitation 
using the VPN antiserum, the 53 kDa ALK-5 protein was 
precipitated in both the wild-type and R mutant Mv1Lu 
cells. These results suggest that the type I receptor 
expressed in the R mutant is ALK-5, which has lost the 
affinity for binding to TGF-B after mutation. 

The type I and type II TGF-B receptor complexes could 
be precipitated by the VPN and DRL antisera in other cell 
lines, including human foreskin fibroblasts (AG1518), human 
lung adenocarcinoma cells (A549), and human oral squamous 
cell carcinoma cells (HSC-2). Affinity cross-linking 
studies revealed multiple TGF-B type I receptor-like 
complexes of 70-77 kDa in these cells. These components 
were less efficiently competed by excess unlabelled TGF-B1 
in HSC-2 cells. Moreover, the type II receptor complex was 
low or not detectable in A549 and HSC-2 cells. Cross- 
linking followed by immunoprecipitation revealed that the 
VPN antiserum precipitated only the 70 kDa complex among 
the 70-77 kDa components. The DRL antiserum precipitated 
the 94 kDa type II receptor complex as well as the 70 kDa 
type I receptor complex in these cells, but not the 
putative type I receptor complexes of slightly larger 
sizes. These results suggest that multiple type I TGF-B 
receptors may exist and that the 70 kDa complex containing 
ALK-5 forms a heteromeric complex with the TGF-B type II 
receptor cloned by Lin et al (1992) Cell 68, 775-785, more 
efficiently than the other species. In rat 
pheochromocytoma cells (PC12), which have been reported to 
have no TGF-B receptor complexes by affinity cross-linking 
(Massagué et al (1990) Ann. N.Y. Acad. Sci. 593, 59-72), 
neither VPN nor DRL antisera precipitated the TGF-B 
receptor complexes. The antisera against ALKs -1 to -4 and 
ALK-6 did not efficiently immunoprecipitate the crosslinked 
receptor complexes in porcine aortic endothelial (PAE) 
cells or human foreskin fibroblasts. 

Next, it was investigated whether ALKs could restore 
responsiveness to TGF-B in the R mutant of Mv1Lu cells, 
which lack the ligand-binding ability of the TGF-B type I 
receptor but have intact type II receptor. Wild-type Mv1Lu 
cells and mutant cells were transfected with ALK cDNA and 
were then assayed for the production of plasminogen 
activator inhibitor-1 (PAI-1), which is produced as a result 
of TGF-B receptor activation, as described previously by 
Laiho et al (1991) Mol. Cell Biol. 11, 972-978. Briefly, 
cells were incubated with or without 10 ng/ml of TGF-B1 for 
2 hours in serum-free MCDB 104 without methionine. 
Thereafter, cultures were labelled with [35S]-methionine (40 
µCi/ml) for 2 hours. The cells were removed by washing on 
ice once in PBS, twice in 10 mM Tris-HCl (pH 8.0), 0.5% 
sodium deoxycholate, 1 mM PMSF, twice in 2 mM Tris-HCl (pH 
8.0), and once in PBS. Extracellular matrix proteins were 
extracted by scraping cells into the SDS-sample buffer 
containing DTT, and analyzed by SDS-gel electrophoresis 
followed by fluorography using Amplify. PAI-1 can be 
identified as a characteristic 45 kDa band (Laiho et al 
(1991) Mol. Cell Biol. 11, 972-978). Wild-type Mv1Lu cells 
responded to TGF-B and produced PAI-1, whereas the R mutant 
clone did not, even after stimulation by TGF-B1. Transient 
transfection of the ALK-5 cDNA into the R mutant clone led 
to the production of PAI-1 in response to the stimulation 
by TGF-B1, indicating that the ALK-5 cDNA encodes a 
functional TGF-B type I receptor. In contrast, the R 
mutant cells that were transfected with other ALKs did not 
produce PAI-1 upon the addition of TGF-B1. 

Using approaches similar to those described above for 
the identification of TGF-B-binding ALKs, the ability of 
ALKs to bind activin in the presence of ActR-II was 
examined. COS-1 cells were co-transfected as described 
above. Recombinant human activin A was iodinated using the 
chloramine T method (Mathews and Vale (1991) Cell 65, 973- 
982). Transfected COS-1 cells were analysed for binding 
and crosslinking of 125I-activin A in the presence or 
absence of excess unlabelled activin A. The crosslinked 
complexes were subjected to immunoprecipitation using DRL 
antisera or specific ALK antisera. 

All ALKs appear to bind activin A in the presence of 
ActR-II. This is more clearly demonstrated by affinity 
cross-linking followed by immunoprecipitation. ALK-2 and 
ALK-4 bound 125I-activin A and were coimmunoprecipitated 
with ActR-II. Other ALKs also bound 125I-activin A but with 
a lower efficiency compared to ALK-2 and ALK-4. 

In order to investigate whether ALKs are physiological 
activin type I receptors, activin responsive cells were 
examined for the expression of endogenous activin type I 
receptors. Mv1Lu cells, as well as the R mutant, express 
both type I and type II receptors for activin, and the R 
mutant cells produce PAI-1 upon the addition of activin A. 
Mv1Lu cells were labeled with 125I-activin A, cross-linked 
and immunoprecipitated by the antisera against ActR-II or 
ALKs as described above. 

The type I and type II receptor complexes in Mv1Lu 
cells were immunoprecipitated only by the antisera against 
ALK-2, ALK-4 and ActR-II. Similar results were obtained 
using the R mutant cells. PAE cells do not bind activin 
because of the lack of type II receptors for activin, and 
so cells were transfected with a chimeric receptor, to 
enable them to bind activin, as described herein. A 
plasmid (chim A) containing the extracellular domain and C- 
terminal tail of ActR-II (amino-acids -19 to 116 and 465 
to 494, respectively (Mathews and Vale (1991) Cell 65, 
973-982)) and the kinase domain of TBR-II (amino-acids 160- 
543) (Lin et al (1992) Cell 68, 775-785) was constructed 
and transfected into pcDNA/neo (Invitrogen). PAE cells 
were stably transfected with the chim A plasmid by 
electroporation, and cells expressing the chim A protein 
were established as described previously. PAE/Chim A cells 
were then subjected to 125I-activin A labelling, crosslinking 
and immunoprecipitation as described above. 

Similar to Mv1Lu cells, activin type I receptor 
complexes in PAE/Chim A cells were immunoprecipitated by 
the ALK-2 and ALK-4 antisera. These results show that both 
ALK-2 and ALK-4 serve as high affinity type I receptors for 
activin A in these cells. 

ALK-1, ALK-3 and ALK-6 bind TGF-B1 and activin A in 
the presence of their respective type II receptor, but the 
functional consequences of the binding of the ligands 
remain to be elucidated. 

The invention has been described by way of example 
only, without restriction of its scope. The invention is 
defined by the subject matter herein, including the claims 
that follow the immediately following full Sequence 
Listings. 

SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(i) APPLICANT: 
(A) NAME: Ludwig Institute for Cancer Research 
(B) STREET: St. Mary's Hospital Medical School, Norfolk Place 
(C) CITY: Paddington, London 
(E) COUNTRY: United Kingdom 
(F) POSTAL CODE (ZIP): W2 1PG 

(ii) TITLE OF INVENTION: PROTEINS HAVING SERINE/THREONINE KINASE 
DOMAINS, CORRESPONDING NUCLEIC ACID MOLECULES, AND THEIR 
USE 

(iii) NUMBER OF SEQUENCES: 29 

(iv) COMPUTER READABLE FORM: 
(A) MEDIUM TYPE: Floppy disk 
(B) COMPUTER: IBM PC compatible 
(C) OPERATING SYSTEM: PC-DOS/MS-DOS 
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1984 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(v) FRAGMENT TYPE: internal 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Homo sapiens 

(ix) FEATURE: 
(A) NAME/KEY: CDS 
(B) LOCATION: 283..1791 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

AGGAAACGGT TTATTAGGAG GGACTGGTGG AGCTGCGCCA GGGAGGAAGA CCCTGGAATA 60 

AGAAACATTT TTGCTCCAGC CCCCATCCCA CTCCCGGGAG GCTGCCGCCC CAGCTGCGCC 120 

GAGCGAGCCC CTCCCCGGCT CCAGCCCCGT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT 180 

CCAGCGCTGG CGGTGCAACT CCCGCCGCCC GGTGGAGGGC AGGTGGCCCC GGTCCGCCCA 240 



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AGGCTACCGC CCOCCCACCC CCAGAGCGCC CCCAGAGGGA CC ATO XCC TTC CCC 294 

Met Thr Leu Cly 

TCC CCC AGO AAA CCC CTT CTC ATC CTC CTC ATG CCC TTC CTC ACC CAC 342 
Ser Pro Arg Lye Gly Lau Leu Met Leu Lau Hmt Ala Leu Val Thr Gin 
5 10 15 20 

GGA GAC CCT GTG AAO CCC TCT CCC CCC CCC CTC CTC ACC TCC ACC TCT 390 
Cly Asp Pro Val Lya Pro Smr Arg Gly Pro Lmu Val Thr Cya Thr Cya 
25 30 35 

CAC ACC CCA CAT TCC AAG CCC CCT ACC TCC CCC CCC CCC TCC TCC ACA 438 
Clu Ser Pro His Cys Lys Cly Pro Thr Cys Arg Cly Ala Trp Cya Thr 
40 45 50 

CTA CTC CTC GTG CCC GAG CAC CCC ACC CAC CCC CAC GAA CAT CCC CCC 486 
Val Val Lmu Val Arg Clu Clu Gly Arg His Pro Gin Glu His Arg Cly 
55 60 65 

TCC CCC AAC TTC CAC ACC CAC CTC TCC ACC CCC CCC CCC ACC CAC TTC 534 
Cys Cly Am Leu His Arg Clu Lau Cya Arg Cly Arg Pro Thr Clu Pha 
70 75 60 

CTC AAC CAC TAC TCC TCC CAC AGC CAC CTC TCC AAC CAC AAC GTG TCC 582 
Val Aon His Tyr Cya Cys Asp Ser His Lau Cys Asn His Asn Val Ser 
85 90 95 100 

CTC CTC CTC CAC CCC ACC CAA CCT CCT TCC CAC CAC CCC CCA ACA GAT 630 
Lau Val Lau Clu Ala Thr Gin Pro Pro Sar Clu Cln Pro Cly Thr Asp 
105 HO 115 

CCC CAG CTC CCC CTC ATC CTG CCC CCC GTG CTC CCC TTC CTC CCC CTC 678 
Cly Gin Lau Ala Lau Xla Lau Cly Pro Val Lau Ala Lau Lau Ala Lau 
120 125 130 

CTG GCC CTC CGT CTC CTC CCC CTC TCC CAT CTC CGA CCC AGC CAC CAC 726 
Val Ala Leu Cly Val Lau Cly Leu Trp His Val Arg Arg Arg Cln Glu 
135 140 145 

AAG CAG CGT CCC CTC CAC ACC GAC CTC CGA CAC TCC ACT CTC ATC CTC 774 
Lys Gin Arg Cly Lau His Ser Clu Leu Gly Glu Ser Sar Lau Xla Lau 
150 155 160 

AAA CCA TCT CAC CAC CCC CAC ACC ATC TTC CCC CAC CTC CTC CAC ACT 822 
Lys Ala Sar Clu Cln Cly Asp Thr Mat Lau Cly Asp Lau Lau Asp Sar 
165 170 175 180 

CAC TGC ACC ACA CCC ACT CCC TCA CCC CTC CCC TTC CTG CTC CAC AGC 870 
Asp Cys Thr Thr Gly Sar Cly Ser Cly Lau Pro Fha Lau Val Cln Arg 
185 X90 195 

ACA GTG CCA CCC CAG CTT CCC TTC CTG CAC TCT CTC CGA AAA GCC CCC 9X8 
Thr Val Ala Arg Gin Val Ala Lau Val Clu Cya Val Gly Lya Cly Arc 
200 205 210 

TAT CGC GAA CTG TGC CCC CCC TTC TGC CAC CGT CAC AGT CTC CCC GTC 966 
Tyr Gly Glu Val Trp Arg Cly Lau Trp Hia Cly Clu Ser Val Ala Val 
215 220 225

AAG ATC TTC TCC TCC AGO OAT CAA CAG TCC TOO TTC CGG CAO ACT GAG 1014
Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg Glu Thr Glu
230 235 240 

ATC TAT AAC ACA CTA TTG CTC AGA CAC GAC AAC ATC CTA CCC TTC ATC 1062 
XI* Tyr Asn Thr Val Leu Lau Arg His Asp Asn lie Leu Cly Ph 21a 
245 250 255 260 

CCC TCA CAC ATG ACC TCC CCC AAC TOG AGC ACG CAG CTC TCC CTC ATC 1110 
Ala Ser Asp Mat Thr 8sr Arg Am Ssr Ssr Thr Gin Leu Trp Leu llm 
265 270 275 

ACC CAC TAC CAC GAG CAC GGC TCC CTC TAG GAC ITT CTC CAG AGA CAG 1158 
Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phs Lsu Gin Arg Gin 
280 285 290 

ACG CTC GAG CCC CAT CTC OCT CTC AGO CTA OCT CTC TCC COG CCA TCC 1206 
Thr Lsu Glu Pro Bis Lsu Als Lsu Arg Lsu Ala Vsl Ssr Ala Ala Cys 
295 300 305 

GGC CTC CCC CAC CTC CAC CTG GAG ATC TTC GGT ACA CAG GGC AAA CCA 1254 
Gly Lau Ala His Lau His Val Glu lis Phs Gly Thr Gin Cly Lys Pro 
310 315 320 

GCC ATT GCC CAC CCC GAC TTC AAG AGC CCC AAT CTG CTG GTC AAG AGC 1302 • 

Als lis Als His Arg Asp Phs Lys Ssr Arg Asn Val Lsu Val Lys Ear 
325 330 335 340 

AAC CTC CAG TCT TCC ATC CCC GAC CTC GCC CTG CCT CTG ATG CAC TCA 1350 
Asn Leu Gin Cys Cys lis Ala Asp Leu Gly Leu Ala Val Hat His Ser 
345 350 355 

CAC CCC ACC GAT TAC CTG GAC ATC GGC AAC AAC CCC AGA CTG GGC ACC 1398 
Gin Gly Ser Asp Tyr Leu Asp lie Gly Asn Asn Pro Arg Val Gly Thr 
360 365 370 

AAG CCC TAC ATG CCA CCC GAG CTC CTG GAC CAG CAG ATC CCC ACG GAC 1446 
Lys Arg Tyr Kst Ala Pro Glu Val Leu Asp Glu Gin lis Arg Thr Asp 
375 380 385 

TCC TTT CAG TCC TAC AAG TGG ACT GAC ATC TCC CCC TTT GGC CTC GTC 1494 
Cys Phs Glu Ssr Tyr Lys Trp Thr Asp lis Trp Ala Pha Gly Leu Val 
390 395 400 

CTC TGG GAG ATT CCC CCC CGG ACC ATC CTG AAT GCC ATC CTG GAG CAC 1542 
Leu Trp Glu He Ala Arg Arg Thr lie Val Asn Gly lis Val Glu Asp 
405 410 415 420 

TAT AGA CCA CCC TTC TAT GAT CTG CTC CCC AAT CAC CCC AGC TTT CAC 1590 
Tyr Arg Pro Pro Phs Tyr Asp Val Val Pro Asn Asp Pro Ssr Phs Glu 
425 430 435 

CAC ATG AAG AAG GTG CTG TCT CTG GAT CAG CAG ACC CCC ACC ATC CCT 1638 
Abo Met Lys Lys Vsl Val Cys Val Asp Gin Gin Thr Pro Thr Zle Pro 
440 445 450 



AAC CGG CTG CCT GCA CAC CCG GTC CTC TCA GGC CTA GCT CAC ATG ATG 1686
Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala Gln Met Met
455 460 465

CCC GAG TCC TOG TAC CCA AAC CCC TCT CCC CGA CTC ACC GCG CTO CGC 1734
Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr Ala Leu Arg
470 475 480

ATC AAG AAG ACA CTA CAA AAA ATT AGC AAC AGT CCA GAG AAG CCT AAA 1782
Ile Lys Lys Thr Leu Gln Lys Ile Ser Asn Ser Pro Glu Lys Pro Lys
485 490 495 500

CTC ATT CAA TAGCCCAGCA CCACCTCATT CCTTTCTGCC TGCAGGGCCC 1831
Val Ile Gln

TGGGGGCGTG GCGCGCAGTG GATGGTGCCC TATCTGGGTA CAGCTACTCT GAGTCTGGTC 1891 

TCTGCTGGCG ATCGGCAGCT GCGCCTGCCT GCTCGGCCCC CAGCCCACCC AGCCAAAAAT 1951 

ACAGCTGGGC TGAAACCTGA AAAAAAAAAA AAA 1984 
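
The number printed at the right of each nucleotide row is the position of the last base in that row, so the rows can be checked against one another. The following is a minimal sketch of that check in Python, using the last three rows of SEQ ID NO: 1 exactly as printed above; the function name `check_running_counts` is an illustrative assumption and is not part of PatentIn or of the original filing.

```python
def check_running_counts(rows, start):
    """Return True if every row's printed end position matches the running base count.

    rows  -- listing rows: space-separated base blocks followed by the printed position
    start -- position of the last base before the first row
    """
    position = start
    for row in rows:
        *blocks, printed = row.split()
        position += sum(len(block) for block in blocks)
        if position != int(printed):
            return False
    return True


# Final three rows of SEQ ID NO: 1 as printed; the preceding row ends at 1831.
ROWS = [
    "TGGGGGCGTG GCGCGCAGTG GATGGTGCCC TATCTGGGTA CAGCTACTCT GAGTCTGGTC 1891",
    "TCTGCTGGCG ATCGGCAGCT GCGCCTGCCT GCTCGGCCCC CAGCCCACCC AGCCAAAAAT 1951",
    "ACAGCTGGGC TGAAACCTGA AAAAAAAAAA AAA 1984",
]

if __name__ == "__main__":
    print(check_running_counts(ROWS, start=1831))
```

The final position, 1984, agrees with the length stated in the sequence characteristics (1984 base pairs).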

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Leu Gly Ser Pro Arg Lys Gly Leu Leu Met Leu Leu Met Ala
1 5 10 15

Leu Val Thr Gln Gly Asp Pro Val Lys Pro Ser Arg Gly Pro Leu Val
20 25 30

Thr Cys Thr Cys Glu Ser Pro His Cys Lys Gly Pro Thr Cys Arg Gly
35 40 45

Ala Trp Cys Thr Val Val Leu Val Arg Glu Glu Gly Arg His Pro Gln
50 55 60

Glu His Arg Gly Cys Gly Asn Leu His Arg Glu Leu Cys Arg Gly Arg
65 70 75 80

Pro Thr Glu Phe Val Asn His Tyr Cys Cys Asp Ser His Leu Cys Asn
85 90 95

His Asn Val Ser Leu Val Leu Glu Ala Thr Gln Pro Pro Ser Glu Gln
100 105 110

Pro Gly Thr Asp Gly Gln Leu Ala Leu Ile Leu Gly Pro Val Leu Ala
115 120 125

Leu Leu Ala Leu Val Ala Leu Gly Val Leu Gly Leu Trp His Val Arg
130 135 140

Arg Arg Gln Glu Lys Gln Arg Gly Leu His Ser Glu Leu Gly Glu Ser
145 150 155 160

Ser Leu Ile Leu Lys Ala Ser Glu Gln Gly Asp Thr Met Leu Gly Asp
165 170 175

Leu Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe
180 185 190

Leu Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val
195 200 205

Gly Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Leu Trp His Gly Glu
210 215 220

Ser Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe
225 230 235 240

Arg Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile
245 250 255

Leu Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln
260 265 270

Leu Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe
275 280 285

Leu Gln Arg Gln Thr Leu Glu Pro His Leu Ala Leu Arg Leu Ala Val
290 295 300

Ser Ala Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr
305 310 315 320

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Phe Lys Ser Arg Asn Val
325 330 335

Leu Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala
340 345 350

Val Met His Ser Gln Gly Ser Asp Tyr Leu Asp Ile Gly Asn Asn Pro
355 360 365

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Gln
370 375 380

Ile Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala
385 390 395 400

Phe Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Val Asn Gly
405 410 415

Ile Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Val Val Pro Asn Asp
420 425 430

Pro Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr
435 440 445

Pro Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu
450 455 460

Ala Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu
465 470 475 480

Thr Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Ile Ser Asn Ser Pro
485 490 495

Glu Lys Pro Lys Val Ile Gln

500 
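
Each cDNA record in this listing gives a CDS location, and each paired protein record gives a length in amino acids; the two should agree, since an inclusive CDS range of N bases encodes N/3 residues when the stop codon lies outside the range. The sketch below, in Python, checks this for the four cDNA/protein pairs using only the coordinates and lengths printed in the listing. The names `codon_count`, `RECORDS` and `all_consistent` are illustrative assumptions, not part of the original filing.

```python
def codon_count(start, end):
    """Number of whole codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length % 3:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# (CDS start, CDS end, stated protein length), as printed in the listing.
RECORDS = {
    "SEQ ID NO: 1 / 2": (283, 1791, 503),
    "SEQ ID NO: 3 / 4": (104, 1630, 509),
    "SEQ ID NO: 5 / 6": (310, 1905, 532),
    "SEQ ID NO: 7 / 8": (1, 1515, 505),
}


def all_consistent(records):
    """True if every CDS range encodes exactly the stated number of residues."""
    return all(codon_count(s, e) == n for s, e, n in records.values())


if __name__ == "__main__":
    for name, (s, e, n) in RECORDS.items():
        print(name, codon_count(s, e), n)
```

For SEQ ID NO: 1, for example, 1791 - 283 + 1 = 1509 bases, which is 503 codons, matching the 503 amino acids of SEQ ID NO: 2.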



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 104..1630



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CTCCGAGTAC CCCAGTGACC AGAGTGAGAG AAGCTCTGAA CGAGGGCAOG CGGCTTGAAG 60 

CACTGTGGGC AGATCTGACC AAGAGCCTGC ATTAAGTTGT ACA ATG GTA GAT GGA 115 

Met Val Asp Gly

GTG ATG ATT CTT CCT GTG CTT ATC ATG ATT GCT CTC CCC TCC CCT ACT 163
Val Met Ile Leu Pro Val Leu Ile Met Ile Ala Leu Pro Ser Pro Ser
5 10 15 20

ATG GAA GAT GAG AAG CCC AAC GTC AAC CCC AAA CTC TAC ATG TGT GTG 211
Met Glu Asp Glu Lys Pro Lys Val Asn Pro Lys Leu Tyr Met Cys Val
25 30 35

TGT GAA GOT CTC TCC TGC GGT AAT GAG GAC CAC TGT GAA GGC CAG GAG 259
Cys Glu Gly Leu Ser Cys Gly Asn Glu Asp His Cys Glu Gly Gln Gln
40 45 50

TGC TTT TCC TCA CTG AGC ATC AAC GAT GGC TTC CAC CTC TAC CAG AAA 307
Cys Phe Ser Ser Leu Ser Ile Asn Asp Gly Phe His Val Tyr Gln Lys
55 60 65

GGC TGC TTC CAG CTT TAT GAG CAG GGA AAG ATG ACC TGT AAG ACC CCG 355
Gly Cys Phe Gln Val Tyr Glu Gln Gly Lys Met Thr Cys Lys Thr Pro
70 75 80

CCC TCC CCT GGC CAA CCT CTC GAG TCC TGC CAA CGC GAC TGC TOT AAC 403
Pro Ser Pro Gly Gln Ala Val Glu Cys Cys Gln Gly Asp Trp Cys Asn
85 90 95 100 

AGG AAC ATC ACG CCC CAG CTC CCC ACT AAA GGA AAA TCC TTC CCT GGA 451 
Arg Asn lit Thr Alt Gin Leu Pro Thr Lys Cly Lys Ssr Phe Pro Gly 
105 110 115 

ACA CAG AAT TTC CAC TTC GAG CTT CGC CTC ATT ATT CTC TCT CTA CTC 499 
Thr Gin A»n Phe Hit Leu Glu Val Cly Leu 11m 11m Leu Smr Val Val 
120 125 130 

TTC CCA CTA TCT CTT TTA GCC TCC CTC CTC GGA CTT CCT CTC CGA AAA 547 
Phe Ala Val Cys Leu Leu Ala Cys Leu Leu Gly Val Ala Leu Arg tys 
135 140 145 

TTT AAA AGG CGC AAC CAA GAA CCC CTC AAT CCC CGA GAC CTO CM TAT 595 
Pha Lys Arg Arg Asn Gin Glu Arg Leu Asn Pro Arg Asp Val Glu Tyr 
150 155 160 

GGC ACT ATC CAA GGG CTC ATC ACC ACC AAT CTT GGA GAC ACC ACT TTA 643 
Gly Thr I la Glu Gly Leu Xla Thr Thr Asn Val Gly Asp Sar Thr Leu 
165 170 175 180 

CCA CAT TTA TTC CAT CAT TCC TCT ACA TCA CGA ACT CCC TCT CCT CTT 691 
Ala Asp Leu Leu Asp His Sar Cys Thr Ser Gly Smr Gly Ssr Gly Leu 
185 190 195 

CCT TTT CTC GTA CAA AGA ACA CTC CCT CCC CAC ATT ACA CTC TTG GAG 739 
Pro Pha Leu Val Gin Arg Thr Val Ala Arg Gin 21a Thr Leu Leu Glu 
200 205 210 

TGT CTC CGC AAA CGC ACG TAT CCT GAG CTG TGC AGG CCC ACC TOG CAA 787 
Cys Val Cly Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp Gin 
215 220 225 

GGC CAA AAT CTT CCC CTG AAC ATC TTC TCC TCC CCT CAT CAC AAC TCA 835 
Gly Glu Asn Val Ala Val Lys He Phe Ser Ser Arg Asp Glu Lys Ser 
230 235 240 

TGG TTC AGG CAA ACG CAA TTC TAC AAC ACT CTC ATC CTC AGG CAT CAA 883 
Trp Phe Arg Clu Thr Glu Leu Tyr Asn Thr Val Met Leu Arg His Glu 
245 250 255 260 

AAT ATC TTA GGT TTC ATT CCT TCA CAC ATC ACA TCA ACA CAC TCC ACT 931 
Asn He Leu Gly Phe He Ala Ssr Asp Met Thr Ser Arg His Ser Ser 
265 270 275 

ACC CAC CTC TGC TTA ATT ACA CAT TAT CAT CAA ATC GGA TCC TTC TAC 979 
Thr Gin Leu Trp Leu He Thr His Tyr His Glu Met Gly Ser Leu Tyr 
280 285 290 

CAC TAT CTT CAC CTT ACT ACT CTC CAT ACA CTT ACC TCC CTT CCA ATA 1027 
Asp Tyr Leu Gin Leu Thr Thr Leu Asp Thr Val Ser Cys Leu Arg He 
295 300 305 



CTG CTG TCC ATA CCT ACT GGT CTT CCA CAT TTG CAC ATA GAG ATA TTT 1075 
Val Leu Ser He Ala Ser Gly Leu Ala His Leu His He Glu He Phe 
310 315 320

GCC ACC CAA CCC AAA OCX GCC ATT GCC CAT CCA CAT TTA AAC ACC AAA 1123
Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys
325 330 335 340 

AAT ATT CTC CTT AAO AAC AAT CCA CAO TCT TCC ATA CCA CAT TTC CCC 1171 
Asn lie Leu Val Lys Lys Asn Cly Gin Cys Cya Zle Ala Asp Leu Cly 
345 350 355 

CTC CCA CTC ATC CAT TCC CAC ACC ACC AAT CAO CTT GAT CTC GCC AAC 1219 
Leu Ala Val Mat His Sar Gin Sar Thr Asn Gin Leu Asp Val Cly Asn 
360 365 370 

AAT CCC CCT CTC GCC ACC AAC CCC TAC ATC CCC CCC CAA CTT CTA GAT 1267 
Asn Pro Arg Val Gly Thr Lys Xrg Tyr Mat Ala Pro Glu Val Leu Asp 
375 380 385 

CAA ACC ATC CAC CTC CAT TCT TTC CAT TCT TAT AAA ACC CTC CAT ATT 1315 
Glu Thr lis Cln Val Asp Cys Phe Asp Sar Tyr Lys Arg Val Asp Zle 
390 395 400 

TGG CCC TTT CCA CTT CTT TTC TCC CAA CTC CCC ACC CCC ATC CTC ACC 1363 
Trp Ala Phs Gly Lau Val Leu Trp Glu Val Ala Arg Arg Met Val Ser 
405 410 415 420 

AAT CCT ATA CTC CAC CAT TAC AAC CCA CCG TTC TAC CAT CTG CTT CCC 1411 
Asn Gly lis Val Glu Asp Tyr Lys Pro Pro Phs Tyr Asp Val Val Pro 
425 430 435 

AAT CAC CCA ACT TTT CAA CAT ATC AGG AAC CTA CTC TCT CTG GAT CAA 1459 
Asn Asp Pro Ssr Phs Glu Asp Met Arg Lys Val Val Cys Val Asp Gin 
440 445 450 

CAA ACC CCA AAC ATA CCC AAC ACA TCC TTC TCA CAC CCC ACA TTA ACC 1507 
Cln Arg Pro Am lis Pro Asn Arg Trp Phe Sar Asp Pro Thr Leu Thr 
455 460 465 

TCT CTC CCC AAC CTA ATC AAA CAA TCC TCC TAT CAA AAT CCA TCC CCA 1555 
Ser Leu Ala Lys Leu Met Lys Glu Cys Trp Tyr Gin Asn Pro Ssr Ala 
470 475 480 

ACA CTC ACA CCA CTC CCT ATC AAA AAC ACT TTC ACC AAA ATT CAT AAT 1603 
Arg Leu Thr Alt Leu Arg Zle Lys Lys Thr Leu Thr Lys Zle Asp Asn 
485 490 495 500 

TCC CTC CAC AAA TTC AAA ACT CAC TCT TCACATTTTC ATACTCTCAA 1650 
Ser Leu Asp Lys Leu Lys Thr Asp Cys 
505 



CAACCAACAT TTCACCTTCT TCTCATTCTC CACCTCCCAC CTAATCCTCC CCTGACTCGT 1710

TCTCAGAATG CAATCCATCT CTCTCCCTCC CCAAATCCCT CCTTTCACAA GCCACACCTC 1770

CTACCCACCC ATCTCTTCCC CACACATCAA AACCACCCTA ACCTCCCTCC ATCACTCTGA 1830

ACTCCCCATT TCACCAACTC TTCACACTCC AGAGACTAAT CTTCCACACA CACTCTTCCA 1890

AACCTACCCA CTCCACCAAC ACAGAGAAAT CCTAAAAGAG ATCTGCCCAT TAACTCAGTC 1950

GCTTTCCATA CCTTTCACAA CTCTCCTACA CACTCCCCAC CCCAAACTCA ACCACCTCCT 2010

GAATTTTTAA TCAGCAATAT TOCCTCTCCT TCTCTTCTTT ATTGCACTAG GAATTCTTTG 2070

CATTCCTTAC TTCCACTGTT ACTCTTAATT TTAAAGACCC AACTTGCCAA AATGTTCGCT 2130 

GCGTACTCCA CTCCTCTCTC TTTGGATAAT ACCAATTCAA TTTGCGAAAA CAAAATCTAA 2190 

TCTCACACTT TCCTGCATTT TACACATGTG CTCATGTTTA CAATGATCCC GAACATTAGG 2250 

AATTCTTTAT ACACAACTT7 GCAAATTATT TATTACTTGT CCACTTAGTA CTTTTTACAA 2310

AACTGCTTTC TCCATATGTT AAAGCTTATT TCTATCTCCT CTTATGATTT TATTACACAA 2370 

ATCTTTTTAA CACTATACTC TAAAATGGAC ATTTTCTTTT ATTATCACTT AAAATCACAT 2430 

TTTAACTCCT TCACATTTGT ATCTGTCTAC ACTCXAACTT TTTTTCACTT GATATGCAGA 2490 

ACGXATTTAC CCATTACCCA CCTGACACCA CCCAATATAT TATCCATTTA GAAGCAAAGA 2550 

TTTCAGTACA ATTTTACTCC TCAACGCTAC GCGGAAAATG CATTTTCTTC ACAATTATCC 2610 

ATTACCTCCA TTTAAACTCT CCCACAAAAA AATAACTATT TTCTTTTAAT CTACTTTTTC 2670

TATTTACTAC TTATTTCTAT AAATTAAATA AACTGTTT?C AAGTCAAAAA AAAA 2724 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 509 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Val Asp Gly Val Met Ile Leu Pro Val Leu Ile Met Ile Ala Leu
1 5 10 15

Pro Ser Pro Ser Met Glu Asp Glu Lys Pro Lys Val Asn Pro Lys Leu
20 25 30

Tyr Met Cys Val Cys Glu Gly Leu Ser Cys Gly Asn Glu Asp His Cys
35 40 45

Glu Gly Gln Gln Cys Phe Ser Ser Leu Ser Ile Asn Asp Gly Phe His
50 55 60

Val Tyr Gln Lys Gly Cys Phe Gln Val Tyr Glu Gln Gly Lys Met Thr
65 70 75 80

Cys Lys Thr Pro Pro Ser Pro Gly Gln Ala Val Glu Cys Cys Gln Gly
85 90 95

Asp Trp Cys Asn Arg Asn Ile Thr Ala Gln Leu Pro Thr Lys Gly Lys
100 105 110

Ser Phe Pro Gly Thr Gln Asn Phe His Leu Glu Val Gly Leu Ile Ile
115 120 125

Leu Ser Val Val Phe Ala Val Cys Leu Leu Ala Cys Leu Leu Gly Val
130 135 140

Ala Leu Arg Lys Phe Lys Arg Arg Asn Gln Glu Arg Leu Asn Pro Arg
145 150 155 160

Asp Val Glu Tyr Gly Thr Ile Glu Gly Leu Ile Thr Thr Asn Val Gly
165 170 175

Asp Ser Thr Leu Ala Asp Leu Leu Asp His Ser Cys Thr Ser Gly Ser
180 185 190

Gly Ser Gly Leu Pro Phe Leu Val Gln Arg Thr Val Ala Arg Gln Ile
195 200 205

Thr Leu Leu Glu Cys Val Gly Lys Gly Arg Tyr Gly Glu Val Trp Arg
210 215 220

Gly Ser Trp Gln Gly Glu Asn Val Ala Val Lys Ile Phe Ser Ser Arg
225 230 235 240

Asp Glu Lys Ser Trp Phe Arg Glu Thr Glu Leu Tyr Asn Thr Val Met
245 250 255

Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ser Asp Met Thr Ser
260 265 270

Arg His Ser Ser Thr Gln Leu Trp Leu Ile Thr His Tyr His Glu Met
275 280 285

Gly Ser Leu Tyr Asp Tyr Leu Gln Leu Thr Thr Leu Asp Thr Val Ser
290 295 300

Cys Leu Arg Ile Val Leu Ser Ile Ala Ser Gly Leu Ala His Leu His
305 310 315 320

Ile Glu Ile Phe Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp
325 330 335

Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Gln Cys Cys Ile
340 345 350

Ala Asp Leu Gly Leu Ala Val Met His Ser Gln Ser Thr Asn Gln Leu
355 360 365

Asp Val Gly Asn Asn Pro Arg Val Gly Thr Lys Arg Tyr Met Ala Pro
370 375 380

Glu Val Leu Asp Glu Thr Ile Gln Val Asp Cys Phe Asp Ser Tyr Lys
385 390 395 400

Arg Val Asp Ile Trp Ala Phe Gly Leu Val Leu Trp Glu Val Ala Arg
405 410 415

Arg Met Val Ser Asn Gly Ile Val Glu Asp Tyr Lys Pro Pro Phe Tyr
420 425 430

Asp Val Val Pro Asn Asp Pro Ser Phe Glu Asp Met Arg Lys Val Val
435 440 445

Cys Val Asp Gln Gln Arg Pro Asn Ile Pro Asn Arg Trp Phe Ser Asp
450 455 460

Pro Thr Leu Thr Ser Leu Ala Lys Leu Met Lys Glu Cys Trp Tyr Gln
465 470 475 480

Asn Pro Ser Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Thr
485 490 495

Lys Ile Asp Asn Ser Leu Asp Lys Leu Lys Thr Asp Cys
500 505



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2932 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 310..1905



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
GCTCCGCGCC CAGGGCTGCA CGATGCGTTC CCTGGCCTCC CGACTTATCA AAATATGGAT 60

CAGTTTAATA CTGTCTTCCA ATTCATCAGA TGGAAGCATA CGTCAAAGCT GTTTGCAGAA 120

AATCAGAAGT ACAGTTTTAT CTAGCCACAT CTTGGAGGAC TCCTAAGAAA CCAGTGGGAG 180

TTGAAGTCAT TGTCAAGTGC TTCCGATCTT TTACAAGAAA ATCTCACTGA ATCATAGTCA 240

TTTAAATTGG TGAXGTAGCA AGACCAATTA TTAAACGTGA CACTACACAG CAAACATTAC 300

AATTGAACA ATG ACT CAG CTA TAC ATT TAC ATC ACA TTA TTC CCA CCC 348
Met Thr Gln Leu Tyr Ile Tyr Ile Arg Leu Leu Gly Ala
1 5 10

TAT TTG TTC ATC ATT TCT CCT GTT CAA GGA CAG AAT CTG CAT ACT ATG 396
Tyr Leu Phe Ile Ile Ser Arg Val Gln Gly Gln Asn Leu Asp Ser Met
15 20 25

CTT CAT OGC ACT GGG ATG AAA TCA CAC TCC GAC CAG AAA AAG TCA CAA 444
Leu His Gly Thr Gly Met Lys Ser Asp Ser Asp Gln Lys Lys Ser Glu
30 35 40 45 

AAT GO A CTA ACC TTA CCA CCA GAG GAT ACC TTC CCT TTT TTA AAO TCC 492 
Am Gly Val Thr Lsu Ala Fro Glu Asp Thr Lsu Pro Phs Leu Lys Cys 
50 55 60 

TAT TGC TCA GGG CAC TOT CCA GAT GAT CCT ATT AAT AAC ACA TCC ATA 540 
Tyr Cys Ser Gly Bis Cys Pro Asp Asp All lis Asa Asa Thr Cys lis 
65 70 75 

ACT AAT GGA CAT TGC TTT GCC ATC ATA GAA GAA GAT GAC CAG GGA CAA 568 
Thr Asn Gly Bis Cys Phs Alt Zls lis Glu Glu Asp Asp Gin Gly Glu 
80 65 90 

ACC ACA TTA GCT TCA GGG TCT ATG AAA TAT GAA GGA TCT CAT TTT CAG 636 
Thr Thr Lsu Ala Ssr Gly Cys Mst Lys Tyr Glu Gly Ssr Asp Phs Gin 
95 100 105 

TCC AAA GAT TCT CCA AAA GCC CAG CTA CCC CGG ACA ATA CAA TCT TCT 684 
Cys Lys Asp Ssr Pro Lys Ala Gin Leu Arg Arg Thr Zls Glu Cys Cys 
110 115 120 125 

CCC ACC AAT TTA TCT AAC CAG TAT TTC CAA CCC ACA CTG CCC CCT CTT 732 
Arg Thr Asn Leu Cys Asn Gin Tyr Lsu Gin Pro Thr Leu Pro Pro Val 
130 135 140 

CTC ATA GCT CCC TTT TTT CAT CCC AGC ATT CCA TCC CTG CTT TTC CTC 780 
Val lie Gly Ppo Phs Phs Asp Gly Ser lis Arg Trp Leu Val Leu Lsu 
145 150 155 

ATT TCT ATC CCT CTC TCC ATA ATT CCT ATG ATC ATC TTC TCC ACC TCC 826 
lie Ser Hot Ala Val Cys lis lis Ale Kst Zls Zls Phs Ssr Ssr Cys 
160 165 170 

TTT TCT TAC AAA CAT TAT TCC AAC AGC ATC TCA AGC ACA CCT CCT TAC 876 
Phe Cys Tyr Lys His Tyr Cys Lys Ssr Zls Ssr Ssr Arg Arg Arg Tyr 
175 180 185 

AAT CCT CAT TTC CAA CAG CAT GAA CCA TTT ATT CCA CTT GGA GAA TCA 924 
Asn Arg Asp Leu Glu Gin Asp Glu Als Phs Zls Pro Vsl Gly Glu Ssr 
190 195 200 205 

CTA AAA CAC CTT ATT GAC CAG TCA CAA ACT TCT OCT ACT CGG TCT GGA 972 
Leu Lys Asp Lsu Zls Asp Gin Ser Gin ssr Ssr Gly Ssr Gly Ssr Gly 
210 215 220 

CTA CCT TTA TTC GTT CAC CCA ACT ATT CCC AAA CAG ATT CAC ATG CTC 1020 
Leu Pro Lsu Leu Vsl Gin Arg Thr Zls Als Lys Gin Zls Gin Mst Val 
225 230 235 

CGG CAA GTT GGT AAA GCC CCA TAT GGA CAA GTA TOG ATG GCC AAA TCC 1068 
Arg Gin Val Gly Lys Gly Arg Tyr Gly Glu Val Trp Kst Gly Lys Trp 
240 245 250 



CCT GCC CAA AAA CTC GCC CTC AAA GTA TTC TTT ACC ACT CAA GAA GCC 1116 
Arg Gly Glu Lys Val Ala Val Lys Val Phs Phs Thr Thr Glu Glu Ala 
255 260 265

ACC TOO RRR CGA GAA ACA GAA ATC TAG CAA ACT CTC CTA ATO CCC CAT 1164
Ser Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His
270 275 280 285

CAA AAC ATA CTT OCT TTC ATA GOG CCA GAC ATT AAA GCT ACA CGT TCC 1212 
Clu Asn Ila Leu Gly Phe lie Ala Ala Asp II* Lye cly Thr Gly Ser 
290 295 300 

TCC ACT CAG CTC TAT TTC ATT ACT GAT TAC CAT CAA AAT CGA TCT CTC 1260 
Trp Thr Gin Leu Tyr Lau lie Thr Asp Tyr His Glu Asn Gly Ser Lau 

305 310 315 

TAT GAC TTC CTC AAA TCT CCT ACA CTC CAC ACC AGA CCC CTC CTT AAA 1308 
Tyr Asp Phe Lou Lye Cys Ala Thr Leu Asp Thr Arg Ala Lau Lau Lys 
320 325 330 

TTC CCT TAT TCA CCT CCC TCT CCT CTC, TCC CAC CTC CAC ACA CAA ATT 1356 
Leu Ala Tyr Ser Ala Ala Cya Gly Lau Cys fiia Lau Hia Thr Clu Ila 
335 340 345 

TAT CCC ACC CAA CCA AAC CCC CCA ATT CCT CAT CCA CAC CTA AAC ACC 1404 
Tyr Cly Thr Gin Gly Lys Pro Ala Ila Ala Bis Arg Asp Lau Lys Sar 
350 355 360* 365 

AAA AAC ATC CTC ATC AAC AAA AAT CGG ACT TCC TCC ATT CCT GAC CTC 1452 
Lys Asn lie Lau Ila Lys Lys Asn Cly Sar Cys Cys Ila Ala Asp Lau 
370 375 380 

CCC CTT CCT CTT AAA TTC AAC ACT CAC ACA AAT CAA CTT CAT CTC CCC 1500 
Cly Lou Ala Val Lys Pha Atn Sar Asp Thr Asn Glu Val Asp Val Pro 
385 390 395 

TTC AAT ACC AGO CTC CCC ACC AAA CCC TAC ATC CCT CCC GAA CTC CTC 1548 
Leu Asn Thr Arg Val Cly Thr Lys Arg Tyr Met Ala Pro Clu Val Lau 
400 405 410 

CAC CAA ACC CTC AAC AAA AAC CAC TTC CAC CCC TAC ATC ATC CCT CXC 1596 
Asp Glu Ser Leu Asn Lys Asn His Pha Gin Pro Tyr Ila Mat Ala Aip 
415 420 425 

ATC TAC ACC TTC CCC CTA ATC ATT TCC GAC ATC CCT CCT CCT TCT ATC 1644 
lie Tyr Ser Pha Cly Leu Ila Ila Trp Glu Met Ala Arg Arg Cys Ila 
430 435 440 445 

ACA CGA CCC ATC CTC CAA GAA TAC CAA TTC CCA TAT TAC AAC ATC CTA 1692 
Thr Cly Cly Ila Val Clu Clu Tyr Gin Lau Pro Tyr Tyr Asn Hat Val 
450 455 460 

CCG ACT CAT CCC TCA TAC GAA CAT ATC CGT GAC CTT CTC TCT GTC AAA 1740 
Pro Ser Asp Pro Sar Tyr Clu Asp Mat Arg Clu Val Val Cys Val Lya 
465 470 475 

CCT TTC CCC CCA ATT CTC TCT AAT CCC TCC AAC ACT CAT CAA TCT CTA 1788 
Arg Leu Arg Pro Ila Val Sar Asn Arg Trp Asn Sar Asp Clu Cys Lau 
480 485 490 

CCA CCA CTT TTC AAC CTA ATC TCA GAA TCC TCC CCC CAC AAT CCA CCC 1836 
Arg Ala Val Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala 
495 500 505

TCC AGA CTC ACA OCX TTO ACA ATT AAO AAO AGO CTT OCC AAO ATO CTT 1884
Ser Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val
510 515 520 525

CAA TCC CAA CAT CTA AAA ATC TCATCCTTAA ACCATCCCAC CACAAACTCT 1935
Glu Ser Gln Asp Val Lys Ile
530

ACACTCCAAC AACTCTTTTT ACCCATCCCA TCOGTCCAAT TAGACTGCAA TAAGGATCTT 1995 

AACTTGCTTC TCACACTCTT TCTTCACTAC CTCTTCACAC CCTCCTAATA TTAAACCTTT 2055 

CACTACTCTT ATTACCATAC AAGCTGGGAA CTTCTAAACA CTTCATTCTT TATATATGCA 2115 

CACCTTTATT TTAAATCTCC TTTTTCATGC CTTTTTTTAA CTCCC TTTTT ATGAACTCCA 2175 

TCAAGACTTC AATCCTCATT ACTCTCTCCA CTCAAOCTCT CCCTACTCAA TT GCC T O TT C 2235 

ATAAAACGGT CCTTTCTCTG AAACCCTTAA CAACATAAAT CAGCCCAGCA GACATCGAGA 2295 

AATACACTTT CCCTTTTACC TCACACATTC ACTTCCTTTC TATTCTACCT TTCTAAAACA 2355 

CCCTATACAT GATGATCTGT TTCCCATACT CCTTATTTTA TGATACTTTC TCCT C T C TCC 2415 

TTACTCATCT CTCTCTCTCT CCATGCACAT CCACCCCCCC ATTCCTCTCC TCCCATTTCA 2475 

ATTAGAAGAA AATAATTTAT ATGCATGCAC AGGAAGATAT TGGTGGCCGC TG CTTTTCTG 2S35 

CTTTAAAAAT CCAATATCTG ACCAAGATTC CCCAATCTCA TACAAGCCAT TTACTTTCCA 2595 

AGTGACATAG CTTCCCCACC AGCTTTATTT TTTAACATCA AAGCTGATGC CAAGGCCAAA 2655 

AGAAGTTTAA AGCATCTCTA AATTTCGACT GTTTTCCTTC AACCACCATT TTTTTTCTG C 2715 

TTATTATTTT TGTCACGCAA AGCATCCTCT CCAAACTTGG AGCTTCTATT GCCATGAACC 2775 

ATGCTTACAA AGAAAGCACT TCTTATTGAA GTGAATTCCT GCATTTGATA GCAATGTAAG 2835 

TCCCTATAAC CATGTTCTAT ATTCTTTATT CTCACTAACT TTTAAAAGGG AAGTTATTTA 2895 

TATTTTGTGT ATAATCTGCT TTATTTGCAA ATCACCC 2932 

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 532 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Thr Gln Leu Tyr Ile Tyr Ile Arg Leu Leu Gly Ala Tyr Leu Phe
1 5 10 15

Ile Ile Ser Arg Val Gln Gly Gln Asn Leu Asp Ser Met Leu His Gly
20 25 30

Thr Gly Met Lys Ser Asp Ser Asp Gln Lys Lys Ser Glu Asn Gly Val
35 40 45

Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser
50 55 60

Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly
65 70 75 80

His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu
85 90 95

Ala Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp
100 105 110

Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn
115 120 125

Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val Val Ile Gly
130 135 140

Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Leu Leu Ile Ser Met
145 150 155 160

Ala Val Cys Ile Ile Ala Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr
165 170 175

Lys His Tyr Cys Lys Ser Ile Ser Ser Arg Arg Arg Tyr Asn Arg Asp
180 185 190

Leu Glu Gln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser Leu Lys Asp
195 200 205

Leu Ile Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu
210 215 220

Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val
225 230 235 240

Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu
245 250 255

Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe
260 265 270

Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile
275 280 285

Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln
290 295 300

Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe
305 310 315 320

Leu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr
325 330 335

Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile Tyr Gly Thr
340 345 350

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys Asn Ile
355 360 365

Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala
370 375 380

Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Val Pro Leu Asn Thr
385 390 395 400

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Ser
405 410 415

Leu Asn Lys Asn His Phe Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser
420 425 430

Phe Gly Leu Ile Ile Trp Glu Met Ala Arg Arg Cys Ile Thr Gly Gly
435 440 445

Ile Val Glu Glu Tyr Gln Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp
450 455 460

Pro Ser Tyr Glu Asp Met Arg Glu Val Val Cys Val Lys Arg Leu Arg
465 470 475 480

Pro Ile Val Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val
485 490 495

Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu
500 505 510

Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln
515 520 525

Asp Val Lys Ile
530

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2333 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1515

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

XTC CCC GAG TOG GCC GCA GCC TCC TCC TTC TTC CCC CTT CTT GTC CTC 48 
Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pr Leu Val Val Leu 
1 5 10 15 

CTG CTC GCC GGC AGO GCC GGG TCC GGG CCC CGG CGG GTC CAG GCT CTC 96 
Leu Leu Ala Gly 5er Gly Gly Ser Gly Pro Arg Gly Val Gin Ala Leu 
20 25 30 

CTG TGT CCG TGC ACC ACC TGC CTC CAG GCC AAC TAC ACG TCT GAG ACA 144 
Leu Cye Ala Cya Thr Ser Cya Leu Gin Ala Aan Tyr Thr Cya Glu Thr 
35 40 45 

GAT GGG GCC TGC ATG GTT TCC TTT TTC AAT CTC CAT GGG ATG GAG CAC 192 
Asp Gly Ala Cya Met Val Ser Phe Phe Aan Leu Aap Gly Met Glu file 
50 55 60 

CAT GTG CGC ACC TGC ATC CCC AAA GTG GAG CTG GTC CCT GCC GGG AAG 240 
Hia Val Arg Thr Cya He Pro Lya Val Glu Leu Val Pro Ala Gly Lye 
65 70 75 80 

CCC TTC TAC TGC CTC AGC TCG GAG GAC CTG CGC AAC ACC CAC TCC TGC 288 
Pro Phe Tyr Cya Leu Ser Ser Glu Asp Leu Arg Aan Thr Hia Cya Cya 
85 90 95 

TAC ACT CAC TAC TGC AAC AGG ATC CAC TTG AGG GTG CCC AGT GGT CAC 336 
Tyr Thr Asp Tyr Cys Asn Arg He Aap Leu Arg Val Pro Ser Gly Hie 
100 105 110 

CTC AAG GAG CCT GAG CAC CCG TCC ATC TGG CGC CCC GTG CAG CTG CTA 384 
Leu Lya Glu Pro Glu His Pro Ser Met Trp Gly Pro Val Glu Leu Val 
115 120 i25 

GGC ATC ATC GCC GGC CCC GTG TTC CTC CTC TTC CTC ATC ATC ATC ATT 432 
Gly He He Ala Gly Pro Val Phe Leu Leu Phe Leu He He He He 
130 135 140 

CTT TTC CTT CTC ATT AAC TAT CAT CAG CGT CTC TAT CAC AAC CGC CAG 460 
Val Phe Leu Val He Asn Tyr His Gin Arg Val Tyr His Asn Arg Gin 
145 150 1S5 160 

AGA CTG CAC ATG GAA GAT CCC TCA TCT CAG ATG TCT CTC TCC AAA CAC 528 
Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp 
165 170 175 

AAG ACG CTC CAG GAT CTT CTC TAC CAT CTC TCC ACC TCA GGG TCT CGC 576 
Lya Thr Leu Gin Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly 
180 185 190 

TCA GGG TTA CCC CTC TTT CTC CAG CGC ACA CTG GCC CCA ACC ATC CTT 624 
Ser Gly Leu Pro Leu Phe Val Gin Arg Thr Val Ala Arg Thr He Val 
195 200 205 

TTA CAA GAG ATT ATT GGC AAG CGT CCG TTT GGG CAA CTA TGG CCG CGC 672 
Leu Gin Glu He He Gly Lys Gly Arg Phe Gly Clu Val Trp Arg Gly 
210 215 220

CCC TCG AGO OCT OCT GAT GTO OCT OTQ AAA ATA TTC TCT TCT OCT GAA 720
Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu
225 230 235 240 

GAA CCG TCT TGG TTC AGG GAA OCA GAG ATA TAC CAG ACS GTC ATG CTG 768 
Glu Arg Ssr Trp Phs Arg Clu Ala Glu lit Tyr Gin Thr Vsl Mst Lsu 
245 250 255 

CCC CAT CAA AAC ATC CTT GGA TTT ATT OCT GOT CAC AAT AAA CAT AAT 816 
Arg His Glu Asn Zls Leu Gly Phs Zls Ala Ala Asp Asn Lys Asp Asa 
260 265 270 

GGC ACC TGG ACA CAG CTG TGG CTT GTT TCT GAC TAT CAT GAG CAC GGG 664 
Gly Thr Trp Thr Gin Lsu Trp Lsu Val Ssr Asp Tyr Bis Clu Bis Gly 
275 280 285 

TCC CTG TTT GAT TAT CTG AAC CGG TAC ACA CTG ACA ATT CAG GGG ATG 912 
Ssr Lsu Phs Asp Tyr Lsu Asn Arg Tyr Thr Val Thr lis Glu Gly Mat 
290 295 300 

ATT AAG CTG GCC TTG TCT CCT OCT AGT GGG CTG GCA CAC CTG CAC ATG 960 
lis Lys Leu Ala Lsu Ssr Ala Ala Ssr Gly Lsu Ala Bis Lsu His Mst 
305 310 315 320 

CAG ATC CTG GGC ACC CAA GGG AAG CCT GGA ATT CCT CAT CCA GAC TTA 1008 
Glu lis Val Gly Thr Gin Gly Lys Pro Gly lis Ala Bis Arg Asp Lsu 
325 330 335 

AAG TCA AAC AAC ATT CTG CTG AAG AAA AAT GGC ATG TGT GCC ATA GCA 1056 
Lys ser Lys Asn Zls Lsu Val Lys Lys Asn Gly Kst Cys Ala lis Ala 
340 345 350 

CAC CTG GGC CTG CCT GTC CCT CAT CAT GCA GTC ACT CAC ACC ATT CAC 1104 
Asp Lsu Gly Lsu Ala Vsl Arg Bis Asp Ala Val Thr Asp Thr Zls Asp 
355 360 365 

ATT CCC CCC AAT CAG AGG CTG GGG ACC AAA CCA TAC ATG GCC CCT CAA 1152 
lis Ala Pro Asn Gin Arg Val Gly Thr Lys Arg Tyr Kst Ala Pro Glu 
370 375 380 

CTA CTT CAT GAA ACC ATT AAT ATG AAA CAC TTT CAC TCC TTT AAA TCT 1200 
Val Leu Asp Glu Thr lis Asn Mst Lys His Phs Asp Ssr Phs Lys Cys 
385 390 395 400 

GCT GAT ATT TAT GCC CTC GGG CTT GTA TAT TGG CAG ATT CCT CCA AGA 1248 
Als Asp lis Tyr Ala Lsu Gly Lsu Val Tyr Trp Glu lis Ala Arg Arg 
405 410 415 

TGC AAT TCT GGA GGA CTC CAT CAA CAA TAT CAG CTG CCA TAT TAC GAC ; . 1296 
Cys Asn Ssr Gly Gly Val His Glu Glu Tyr Gin Lsu Pro Tyr Tyr Asp 
420 425 430 

TTA GTG CCC TCT GAC CCT TCC ATT GAG GAA ATC CCA AAG CTT GTA TCT 1344 
Leu Val Pro Ser Asp Pro Ser Zls Glu Glu Kst Arg Lys Val Val Cys 
435 440 445 

CAT CAC AAG CTG CCT CCC AAC ATC CCC AAC TGG TGG CAG ACT TAT CAG 1392 
Asp Gin Lys Lsu Arg Pro Asn Zls Pro Asn Trp Trp Cln Ssr Tyr Glu 
450 455 460 
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CCA CTG CCC CTC ATC CGG AAG ATC ATC OCA CAC TCT TCC TAT GCC AAC 1440 
Ala Leu Arg Val Met Gly Lya Mat Mat Arg Clu Cye Trp Tyr Ala Aan 
465 470 475 480 

GCC CCA CCC CCC CTC AOS CCC CTC CCC ATC AAG AAG ACC CTC TCC CAC 1486 
Gly Ala Ala Arg Leu Thr Ala Leu Arg II Lya Lye Thr Leu Ser Gin 
485 490 495 

CTC ACC CTC CAC GAA CAC GTC AAG ATC TAACTGCTCC CTCTCTCCAC 1535 
Leu Ser Val Gin Glu Asp Val Lya He 
500 505 



ACGGACCTCC TCCCAGOGAC AACTACCCAC ACCTCCCGCC TTGACCCTAC CATGGAGGCC 1595

TACCTCTCGT TTCTGCCCAG CCCTCTCTCG CCACCAGCCC TCCCCCCCAA GAGGGACAGA 1655

GCCCCGCAGA GACTCCCTCA CTCCCATCTT GGGTTTGAGA CAGACACCTT TTCTATTTAC 1715

CTCCTAATCG CATCGACACT CTCAGAGCGA ATTCTCTCGA CAACTCAGTG CCACACCTCC 1775

AACTGGTTCT AGTCGGAAGT CCCCOSAAAC CCGGTCCATC TGGCACCTGG CCAGGAGCCA 1835

TCACACCGCC CCTTCGGAGG GGCCGGAGGA ACCCAGGTGT TCCCACTCCT AAGCTGCCCT 1895

CAGGGTTTCC TTCCGGGACC AGCCCACAGC ACACCAAGGT GGCCCGGAAG AACCAGAAGT 1955

CCACCCCCTC TCACACGCAC CTCTCAGCCC CCCTTTCCCC TCCTCCCTCC GATCCAOGCT 2015

GCCGGCAGAC TCCCAGTGGA GACGGAATCT CCCCCTTTCT CTCTCCACCC CTCTCTGCAT 2075

CTCCCCAGCT GCCTCCCCCC TTCTCCCTCG TTCCTCCCAT GCCCTTACAC GTGCCTGTCA 2135

CTCTCTCTCT CTCTCTCTAC CTCCCCACTT ACCTCCTTCA CCTTTCTCTC CATCTCCACG 2195

TCGCGGCTGT CGTCCTCATC CTCTCCGTCC TTCCTGGTCC CTCTTTTCAC TACTGAGCAC 2255

CATCTACTTT CCCTOGTCCC CTTCCCTCGA CGTCTCTCCC TCCCCCACAC CCCCTCATCC 2315

CACACTGCTA CTCTCTCT 2333



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 505 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu Val Val Leu
1 5 10 15

Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Val Gln Ala Leu
20 25 30

Leu Cys Ala Cys Thr Ser Cys Leu Gln Ala Asn Tyr Thr Cys Glu Thr
35 40 45

Asp Gly Ala Cys Met Val Ser Phe Phe Asn Leu Asp Gly Met Glu His
50 55 60

His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro Ala Gly Lys
65 70 75 80

Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr His Cys Cys
85 90 95

Tyr Thr Asp Tyr Cys Asn Arg Ile Asp Leu Arg Val Pro Ser Gly His
100 105 110

Leu Lys Glu Pro Glu His Pro Ser Met Trp Gly Pro Val Glu Leu Val
115 120 125

Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile
130 135 140

Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln
145 150 155 160

Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp
165 170 175

Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly
180 185 190

Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val
195 200 205

Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly
210 215 220

Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu
225 230 235 240

Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu
245 250 255

Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn
260 265 270

Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly
275 280 285

Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile Glu Gly Met
290 295 300

Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu His Met
305 310 315 320

Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His Arg Asp Leu
325 330 335

Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys Ala Ile Ala
340 345 350

Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp Thr Ile Asp
355 360 365

Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu
370 375 380

Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys
385 390 395 400

Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg
405 410 415

Cys Asn Ser Gly Gly Val His Glu Glu Tyr Gln Leu Pro Tyr Tyr Asp
420 425 430

Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys
435 440 445

Asp Gln Lys Leu Arg Pro Asn Ile Pro Asn Trp Trp Gln Ser Tyr Glu
450 455 460

Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn
465 470 475 480

Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln
485 490 495

Leu Ser Val Gln Glu Asp Val Lys Ile
500 505

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 77..1585

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GGCGAGGCGA CGTTTCCTCC CCTCACCCAC CCGCGCCGCC CCCCCCCGCC CCCCCACACC 60

CGCTCCCCCC CSCACC ATO CAO COS OCO CTC CCT CCT CCO OCT CCC CCC 109
Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg
1 5 10

CTC ??? CTC CTC CTC ??? CCS CCC GGG CCC ??? ??? ??? CCC CCC CTC 157
Leu Leu Leu Leu Val Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu
15 20 25

CTC ??? GGG CCC ACC ??? TTA ??? TCT TTC ??? ??? CTC TCT ACA AAA 205
Leu Pro Gly Ala Thr Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys
30 35 40

GAC AAT TTT ACT TGT CTC ACA ??? COG CTC ??? TTT CTC TCT ??? ??? 253
Asp Asn Phe Thr Cys Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr
45 50 55

GAC ACC ACA GAC AAA CTT ATA CAC AAC AGC ??? TCT ATA CCT CAA ATT 301
Glu Thr Thr Asp Lys Val Ile His Asn Ser Met Cys Ile Ala Glu Ile
60 65 70 75

CAC TTA ATT CCT CCA GAT ??? ??? TTT GTA TCT ??? ??? ??? TCA AAA 349
Asp Leu Ile Pro Arg Asp Arg Pro Phe Val Cys Ala Pro Ser Ser Lys
80 85 90

ACT COG TCT CTC ACT ACA ACA TAT TGC TGC AAT CAC ??? ??? TCC ??? 397
Thr Gly Ser Val Thr Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn
95 100 105

AAA ATA GAA CTT CCA ACT ACT GTA AAG TCA TCA CCT GGC CTT GGT CCT 445
Lys Ile Glu Leu Pro Thr Thr Val Lys Ser Ser Pro Gly Leu Gly Pro
110 115 120

CTC GAA CTC CCA CCT CTC ATT CCT CCA CCA ??? TCC ??? ??? TCC ATC 493
Val Glu Leu Ala Ala Val Ile Ala Gly Pro Val Cys Phe Val Cys Ile
125 130 135

TCA CTC ATG TTC ATG ??? ??? ??? TGC CAC ??? CCC ACT CTC ATT ??? 541
Ser Leu Met Leu Met Val Tyr Ile Cys His Asn Arg Thr Val Ile His
140 145 150 155

CAT CCA CTC CCA AAT ??? ??? GAC CCT TCA TTA CAT CCC CCT TTT ATT 589
His Arg Val Pro Asn Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile
160 165 170

TCA GAG GGT ACT ACC TTC AAA CAC TTA ATT TAT CAT ATG ACA ACC TCA 637
Ser Glu Gly Thr Thr Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr Ser
175 180 185

CCT TCT GCC TCA GGT TTA CCA TTC CTT CTT CAC ACA ACA ATT GCC AGA 685
Gly Ser Gly Ser Gly Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Arg
190 195 200

ACT ATT CTC TTA CAA GAA AGC ATT CCC AAA GGT CCA TTT CCA GAA CTT 733
Thr Ile Val Leu Gln Glu Ser Ile Gly Lys Gly Arg Phe Gly Glu Val
205 210 215

TCC ACA CCA AAC TCC CCC CCA CAA CAA CTT CCT CTT AAC ATA TTC TCC 781
Trp Arg Gly Lys Trp Arg Gly Glu Glu Val Ala Val Lys Ile Phe Ser
220 225 230 235

TCT AGA GAA GAA OCT TCC TOG TTC CCT CAC GCA GAG XTT TXT CM ACT 829
Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr
240 245 250

TA ATC TTA CCT CAT CAA AAC ATC CTC CCA TTT ATA CCA CCA CAC AAT 877
Val Met Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn
255 260 265

AAA CAC AAT CCT ACT TOG ACT CAC CTC TCC TTC CTC TCA GAT TAT CAT 925
Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His
270 275 280

GAG CAT CCA TCC CTT TTT GAT TAC TTA AAC ACA TAC ACA GTT ACT CTC 973
Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val
285 290 295

CAA GCA ATC ATA AAA CTT GCT CTC TCC ACG CCG ACC CCT CTT GCC CAT 1021
Glu Gly Met Ile Lys Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His
300 305 310 315

CTT CAC ATC GAG ATT CTT OCT ACC CAA GCA AAG CCA GCC ATT GCT CAT 1069
Leu His Met Glu Ile Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His
320 325 330

ACA GAT TTC AAA TCA AAG AAT ATC TTC CTA AAG AAG AAT GGA ACT TCC 1117
Arg Asp Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys
335 340 345

TCT ATT CCA CAC TTA GCA CTC CCA CTA ACA CAT GAT TCA GCC ACA GAT 1165
Cys Ile Ala Asp Leu Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp
350 355 360

ACC ATT CAT ATT CCT CCA AAC CAC AGA CTC GGA ACA AAA AGG TAC ATC 1213
Thr Ile Asp Ile Ala Pro Asn His Arg Val Gly Thr Lys Arg Tyr Met
365 370 375

GCC CCT CAA CTT CTC CAT GAT TCC ATA AAT ATC AAA CAT TTT GAA TCC 1261
Ala Pro Glu Val Leu Asp Asp Ser Ile Asn Met Lys His Phe Glu Ser
380 385 390 395

TTC AAA CCT CCT GAC ATC TAT CCA ATC GCC TTA GTA TTC TCG GAA ATT 1309
Phe Lys Arg Ala Asp Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile
400 405 410

CCT CCA CCA TCT TCC ATT GGT GCA ATT CAT CAA CAT TAC CAA CTC CCT 1357
Ala Arg Arg Cys Ser Ile Gly Gly Ile His Glu Asp Tyr Gln Leu Pro
415 420 425

TAT TAT CAT CTT CTA CCT TCT CAC CCA TCA CTT CAA GAA ATG ACA AAA 1405
Tyr Tyr Asp Leu Val Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys
430 435 440

CTT CTT TCT CAA CAG AAG TTA ACC CCA AAT ATC CCA AAC AGA TCC CAC 1453
Val Val Cys Glu Gln Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp Gln
445 450 455

ACC TCT CAA GCC TTC ACA CTA ATG CCT AAA ATT ATC ACA GAA TCT TCG 1501
Ser Cys Glu Ala Leu Arg Val Met Ala Lys Ile Met Arg Glu Cys Trp
460 465 470 475




TAT CCC AAT CCA OCA OCT AGO CTT ACA OCA TTC COO ATT AM AAA ACA 1549
Tyr Ala Asn Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr
480 485 490

TTA TCC CAA CTC ACT CAA CAC CAA CCC ATC AAA ATC TAATTCTACA 1595
Leu Ser Gln Leu Ser Gln Gln Glu Gly Ile Lys Met
495 500

CCTTTOCCTG AACTCTCCTT TTTTCTTCAC ATCTGCTCCT GCGTTTTAAT TTGOCAGCTC 1655

ACTTCTTCTA CCTCACTGAC ACCCAACACA ACCATATTGC TTCCTITTOC ACCACTGTAA 1715 

TAAACTCAAT TAAAAACTTC CCACCATTTC TTTGGACCCA GCAAACACCC ATCTGCCTCC 1775 

rrrCTCTGCA CTATGAACCC TTCTTTCCCA GCACACAAAA TCTCTACTCT ACCTTTATTT 1835

TTTATTAACA AAACTTCTTT TTTAAAAAGA TGATTGCTGG TCTTAACTTT ACCTAACTCT 1895

CCTCTCCTCC ACATCATCTT TAACOOCAAA CCACTTCCAT TCCTCAATTA CAATGAAACA 1955 

TGTCTTATTA C7AAAGAAAC TCATTTACTC CTCGTTACTA CATTCTCACA CCATTCTGAA 2015 

CCACTACACT TTCCTTCATT CACACTTTCA ATCTACTCTT CTATAGTTTT TCACCATCTT 2075 

AAAACTAACA CTTATAAAAC TCTTATCTTC AGTCTAAAAA TCACCTCATA TACTACTCAC 2135

CAACATAATT CATGCAATTG TATTTTGTAT ACTATTATTG TTCTTTCACT TATTCACAAC 2195 

ATTACATCCC TTCAAAATCC GATTCTACTA TACCACTAAC TCCCACTTCT CTCTCTTTCT 2255

AATGGAAATC ACTACAATTC CTCAAACTCT CTATGTTAAA ACCTATACTC TTT 2308 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Glu Ala Ala Val Ala Ala Pro Arg Pro Arg Leu Leu Leu Leu Val
1 5 10 15

Leu Ala Ala Ala Ala Ala Ala Ala Ala Ala Leu Leu Pro Gly Ala Thr
20 25 30

Ala Leu Gln Cys Phe Cys His Leu Cys Thr Lys Asp Asn Phe Thr Cys
35 40 45

Val Thr Asp Gly Leu Cys Phe Val Ser Val Thr Glu Thr Thr Asp Lys
50 55 60

Val Ile His Asn Ser Met Cys Ile Ala Glu Ile Asp Leu Ile Pro Arg
65 70 75 80

Asp Arg Pro Phe Val Cys Ala Pro Ser Ser Lys Thr Gly Ser Val Thr
85 90 95

Thr Thr Tyr Cys Cys Asn Gln Asp His Cys Asn Lys Ile Glu Leu Pro
100 105 110

Thr Thr Val Lys Ser Ser Pro Gly Leu Gly Pro Val Glu Leu Ala Ala
115 120 125

Val Ile Ala Gly Pro Val Cys Phe Val Cys Ile Ser Leu Met Leu Met
130 135 140

Val Tyr Ile Cys His Asn Arg Thr Val Ile His His Arg Val Pro Asn
145 150 155 160

Glu Glu Asp Pro Ser Leu Asp Arg Pro Phe Ile Ser Glu Gly Thr Thr
165 170 175

Leu Lys Asp Leu Ile Tyr Asp Met Thr Thr Ser Gly Ser Gly Ser Gly
180 185 190

Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Arg Thr Ile Val Leu Gln
195 200 205

Glu Ser Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly Lys Trp
210 215 220

Arg Gly Glu Glu Val Ala Val Lys Ile Phe Ser Ser Arg Glu Glu Arg
225 230 235 240

Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu Arg His
245 250 255

Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn Gly Thr
260 265 270

Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly Ser Leu
275 280 285

Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Val Glu Gly Met Ile Lys
290 295 300

Leu Ala Leu Ser Thr Ala Ser Gly Leu Ala His Leu His Met Glu Ile
305 310 315 320

Val Gly Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser
325 330 335

Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp Leu
340 345 350

Gly Leu Ala Val Arg His Asp Ser Ala Thr Asp Thr Ile Asp Ile Ala
355 360 365

Pro Asn His Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu
370 375 380

Asp Asp Ser Ile Asn Met Lys His Phe Glu Ser Phe Lys Arg Ala Asp
385 390 395 400

Ile Tyr Ala Met Gly Leu Val Phe Trp Glu Ile Ala Arg Arg Cys Ser
405 410 415

Ile Gly Gly Ile His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp Leu Val
420 425 430

Pro Ser Asp Pro Ser Val Glu Glu Met Arg Lys Val Val Cys Glu Gln
435 440 445

Lys Leu Arg Pro Asn Ile Pro Asn Arg Trp Gln Ser Cys Glu Ala Leu
450 455 460

Arg Val Met Ala Lys Ile Met Arg Glu Cys Trp Tyr Ala Asn Gly Ala
465 470 475 480

Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln Leu Ser
485 490 495

Gln Gln Glu Gly Ile Lys Met

500 
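
The CDS coordinates in the nucleotide records can be cross-checked against the stated protein lengths. The sketch below is illustrative only and not part of the listing: the helper name `cds_codon_count` is hypothetical, and it assumes the LOCATION range is inclusive and excludes the stop codon, which is what the figures for SEQ ID NO: 9/10, 11/12 and 13/14 suggest.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in an inclusive CDS range start..end."""
    length = end - start + 1  # inclusive coordinates
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# (CDS start, CDS end) from the nucleotide record, and the LENGTH stated
# in the matching protein record.
records = {
    "SEQ ID NO: 9 / 10": ((77, 1585), 503),
    "SEQ ID NO: 11 / 12": ((241, 1746), 502),
    "SEQ ID NO: 13 / 14": ((217, 1812), 532),
}

for name, ((start, end), protein_length) in records.items():
    codons = cds_codon_count(start, end)
    print(name, codons, codons == protein_length)
```

In each of the three pairs the codon count equals the stated number of amino acids, so the running position numbers and residue numbers in the interleaved rows can be checked the same way (each full row of 16 codons advances the position by 48).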

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1922 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 241..1746

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GAGAGCACAG CCCTTCCCAG TCCCCGCAGC CCCCGCGCCA CGCGCGCATG ATCAACACCT 60

TTTCCCCGGC CCCACAGGGC CTCTGCAGGT GAGACCCCGG CCGCCTCCGC AAGGAGAGGC 120 

GGGGGTCGAG TCGCCCTGTC CAAAGGCCTC AATCTAAACA ATCTTGATTC CTGTTGCCGG 180 

CTGGCGGGAC CCTGAATGGC AGGAAATCTC ACCACATCTC TTCTCCTATC TCCAAGGACC 240 

ATG ACC TTG GCG ACC TTC AGA AGG GGC CTT TIG ATG CTG TCG GTG GCC 288
Met Thr Leu Gly Ser Phe Arg Arg Gly Leu Leu Met Leu Ser Val Ala
1 5 10 15

TTG GCC CTA ACC CAO GGG AGA CTT CCG AAO CCT TCC AAG CTC CTG AAC 336
Leu Gly Leu Thr Gln Gly Arg Leu Ala Lys Pro Ser Lys Leu Val Asn
20 25 30

TOC ACT TCT CAG AGC CCA CAC TCC AAG AGA CCA TTC TGC CAG CGG TCA 384
Cys Thr Cys Glu Ser Pro His Cys Lys Arg Pro Phe Cys Gln Gly Ser
35 40 45

TGC TGC ACA GTG CTG CTG GTT CGA GAG CAG GGC AGG CAC CCC CAG GTC 432
Trp Cys Thr Val Val Leu Val Arg Glu Gln Gly Arg His Pro Gln Val
50 55 60

TAT CCC GGC TOT GGG AGC CTG AAC CAG GAG CTC TCC TTG GGA CGT CCC 480
Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu Leu Cys Leu Gly Arg Pro
65 70 75 80

ACG GAG TTT CTC AAC CAT CAC TGC TGC TAT AGA TCC TTC TCC AAC CAC 528
Thr Glu Phe Leu Asn His His Cys Cys Tyr Arg Ser Phe Cys Asn His
85 90 95

AAC CTG TCT CTG ATG CTG GAG GCC ACC CAA ACT CCT TOG GAG CAG CCA 576
Asn Val Ser Leu Met Leu Glu Ala Thr Gln Thr Pro Ser Glu Glu Pro
100 105 110

GAA CTT CAT GCC CAT CTG CCT CTG ATC CTC CCT CCT CTG CTC GCC TTG 624
Glu Val Asp Ala His Leu Pro Leu Ile Leu Gly Pro Val Leu Ala Leu
115 120 125

CCG CTC CTG GTG GCC CTG GOT GCT CTG GGC TTG TGG CGT CTC CGG CCG 672
Pro Val Leu Val Ala Leu Gly Ala Leu Gly Leu Trp Arg Val Arg Arg
130 135 140

AGG CAG GAG AAG CAG CGG CAT TTG CAC ACT GAC CTG GGC GAG TCC AGT 720
Arg Gln Glu Lys Gln Arg Asp Leu His Ser Asp Leu Gly Glu Ser Ser
145 150 155 160

CTC ATC CTC AAG OCA TCT CAA CAG GCA CAC AGC ATG TTG GGC CAC TTC 768
Leu Ile Leu Lys Ala Ser Glu Gln Ala Asp Ser Met Leu Gly Asp Phe
165 170 175

CTG GAC AGC CAC TCT ACC ACG GGC AGC GGC TCG GGG CTC CCC TTC TTC 816
Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe Leu
180 185 190

GTG CAC ACG ACC CTA CCT CCG CAG CTT CCC CTG GTA CAC TCT CTC GGA 864
Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val Gly
195 200 205

AAG CGC CGA TAT GGC CAG CTC TCG CCC CGT TCG TGG CAT CGC CAA AGC 912
Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp His Gly Glu Ser
210 215 220

CTC CCC CTC AAC ATT TTC TCC TCA CCA CAT CAG CAG TCC TCC TTC CCG 960
Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg
225 230 235 240

GAG ACG CAG ATC TAC AAC ACA CTT CTG CTT AGA CAC GAC AAC ATC CTA 1008
Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile Leu
245 250 255

GGC TTC ATC CCC TCC CAC ATG ACT TCO COO AAC TOO ACC ACC CAO CTC 1056
Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln Leu
260 265 270

TCG CTC ATC ACC CAC TAC CAT CAA CAC CCC TCC CTC TAT CAC TTT CTC 1104
Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe Leu
275 280 285

CAG AGG CAG ACG CTO GAG CCC CAG TTG CCC CTC AGG CTA OCT GTC TCC 1152
Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu Arg Leu Ala Val Ser
290 295 300

CCG CCC TCC GGC CTG CCG CAC CTA CAT CTG CAG ATC TTT GGC ACT CAA 1200
Pro Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr Gln
305 310 315 320

GGC AAA CCA GCC ATT CCC CAT CCT GAC CTC AAG ACT CCC AAT GTC CTG 1248
Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Arg Asn Val Leu
325 330 335

GTC AAG ACT AAC TTC CAG TCT TCC ATT CCA GAC CTG GGA CTG GCT CTC 1296
Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala Val
340 345 350

ATG CAC TCA CAA AGC AAC GAG TAC CTG GAT ATC GGC AAC ACA CCC CCA 1344
Met His Ser Gln Ser Asn Glu Tyr Leu Asp Ile Gly Asn Thr Pro Arg
355 360 365

CTG CCT ACC AAA ACA TAC ATG CCA CCC CAG CTC CTG CAT GAG CAC ATC 1392
Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu His Ile
370 375 380

CCC ACA CAC TCC TTT CAG TCC TAC AAG TCG ACA GAC ATC TGG CCC TTT 1440
Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala Phe
385 390 395 400

CCC CTA CTC CTA TCC CAG ATC CCC CCC CCG ACC ATC ATC AAT GCC ATT 1488
Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Ile Asn Gly Ile
405 410 415

CTG CAG CAT TAC AGG CCA CCT TTC TAT CAC ATC CTA CCC AAT CAC CCC 1536
Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Met Val Pro Asn Asp Pro
420 425 430

AGT TTT GAG CAC ATC AAA AAG CTC CTG TCC CTT CAC CAG CAG ACA CCC 1584
Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr Pro
435 440 445

ACC ATC CCT AAC CGC CTC CCT CCA CAT CCC CTC CTC TCC GGG CTG GCC 1632
Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala
450 455 460

CAG ATG ATG ACA GAG TGC TCC TAC CCC AAC CCC TCT CCT CCC CTC ACC 1680
Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr
465 470 475 480

CCA CTC CCC ATA AAG AAG ACA TTC CAG AAG CTC AGT CAC AAT CCA GAG 1728
Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Leu Ser His Asn Pro Glu
485 490 495

AAG CCC AAA CTO ATT CAC TAGCCCACGG CCACCACCCT TCCTCTCCCT 1776
Lys Pro Lys Val Ile His
500

AAAGTGTCTG CTGGGGAACA AGACATAGCC TCTCTGGGTA GAGGGAGTGA AGAGAGTGTG 1836 

CACGCTGCCC TGTCTGTGCC TGCTCAGCTT GCTCCCAGCC CATCCAGCCA AAAATACAGC 1896 

TGACCTGAAA TTCAAAAAAA AAAAAA 1922

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Thr Leu Gly Ser Phe Arg Arg Gly Leu Leu Met Leu Ser Val Ala
1 5 10 15

Leu Gly Leu Thr Gln Gly Arg Leu Ala Lys Pro Ser Lys Leu Val Asn
20 25 30

Cys Thr Cys Glu Ser Pro His Cys Lys Arg Pro Phe Cys Gln Gly Ser
35 40 45

Trp Cys Thr Val Val Leu Val Arg Glu Gln Gly Arg His Pro Gln Val
50 55 60

Tyr Arg Gly Cys Gly Ser Leu Asn Gln Glu Leu Cys Leu Gly Arg Pro
65 70 75 80

Thr Glu Phe Leu Asn His His Cys Cys Tyr Arg Ser Phe Cys Asn His
85 90 95

Asn Val Ser Leu Met Leu Glu Ala Thr Gln Thr Pro Ser Glu Glu Pro
100 105 110

Glu Val Asp Ala His Leu Pro Leu Ile Leu Gly Pro Val Leu Ala Leu
115 120 125

Pro Val Leu Val Ala Leu Gly Ala Leu Gly Leu Trp Arg Val Arg Arg
130 135 140

Arg Gln Glu Lys Gln Arg Asp Leu His Ser Asp Leu Gly Glu Ser Ser
145 150 155 160

Leu Ile Leu Lys Ala Ser Glu Gln Ala Asp Ser Met Leu Gly Asp Phe
165 170 175

Leu Asp Ser Asp Cys Thr Thr Gly Ser Gly Ser Gly Leu Pro Phe Leu
180 185 190

Val Gln Arg Thr Val Ala Arg Gln Val Ala Leu Val Glu Cys Val Gly
195 200 205

Lys Gly Arg Tyr Gly Glu Val Trp Arg Gly Ser Trp His Gly Glu Ser
210 215 220

Val Ala Val Lys Ile Phe Ser Ser Arg Asp Glu Gln Ser Trp Phe Arg
225 230 235 240

Glu Thr Glu Ile Tyr Asn Thr Val Leu Leu Arg His Asp Asn Ile Leu
245 250 255

Gly Phe Ile Ala Ser Asp Met Thr Ser Arg Asn Ser Ser Thr Gln Leu
260 265 270

Trp Leu Ile Thr His Tyr His Glu His Gly Ser Leu Tyr Asp Phe Leu
275 280 285

Gln Arg Gln Thr Leu Glu Pro Gln Leu Ala Leu Arg Leu Ala Val Ser
290 295 300

Pro Ala Cys Gly Leu Ala His Leu His Val Glu Ile Phe Gly Thr Gln
305 310 315 320

Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Arg Asn Val Leu
325 330 335

Val Lys Ser Asn Leu Gln Cys Cys Ile Ala Asp Leu Gly Leu Ala Val
340 345 350

Met His Ser Gln Ser Asn Glu Tyr Leu Asp Ile Gly Asn Thr Pro Arg
355 360 365

Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu His Ile
370 375 380

Arg Thr Asp Cys Phe Glu Ser Tyr Lys Trp Thr Asp Ile Trp Ala Phe
385 390 395 400

Gly Leu Val Leu Trp Glu Ile Ala Arg Arg Thr Ile Ile Asn Gly Ile
405 410 415

Val Glu Asp Tyr Arg Pro Pro Phe Tyr Asp Met Val Pro Asn Asp Pro
420 425 430

Ser Phe Glu Asp Met Lys Lys Val Val Cys Val Asp Gln Gln Thr Pro
435 440 445

Thr Ile Pro Asn Arg Leu Ala Ala Asp Pro Val Leu Ser Gly Leu Ala
450 455 460

Gln Met Met Arg Glu Cys Trp Tyr Pro Asn Pro Ser Ala Arg Leu Thr
465 470 475 480

Ala Leu Arg Ile Lys Lys Thr Leu Gln Lys Leu Ser His Asn Pro Glu
485 490 495

Lys Pro Lys Val Ile His
500

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2070 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 217..1812

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATTCATGAGA TGGAAGCATA GGTCAAACCT GTTCGGAGAA ATTGGAACTA CACTTTTATC 60 

TAGCCACATC TCTGAGAATT CTCAAGAAAG CAGCAGCTCA AACTCATTGC CAAGTGATTT 120

TGTTCTGTAA GGAAGCCTCC CTCATTCACT TACACCACTC ACACAGCAGC ACCAGTCATT 180 

CAAAGGGCCG TGTACAGCAC GCGTGCCAAT CAGACA ATG ACT CAG CTA TAC ACT 234
Met Thr Gln Leu Tyr Thr
1 5

TAC ATC AGA TTA CTC CGA GCC TGT CTG TTC ATC ATT TCT CAT CTT CAA 282
Tyr Ile Arg Leu Leu Gly Ala Cys Leu Phe Ile Ile Ser His Val Gln
10 15 20

GGG CAG AAT CTA CAT ACT ATG CTC CAT GGC ACT OCT ATG AAA TCA GAC 330
Gly Gln Asn Leu Asp Ser Met Leu His Gly Thr Gly Met Lys Ser Asp
25 30 35

TTC GAC CAG AAG AAG CCA GAA AAT GGA GTG ACT TTA CCA CCA GAG GAT 378
Leu Asp Gln Lys Lys Pro Glu Asn Gly Val Thr Leu Ala Pro Glu Asp
40 45 50

ACC TTC CCT TTC TTA AAG TGC TAT TGC TCA GGA CAC TGC CCA CAT GAT 426
Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser Gly His Cys Pro Asp Asp
55 60 65 70

OCT ATT AAT AAC ACA TGC ATA ACT AAT GGC CAT TGC TTT GCC ATT ATA 474
Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly His Cys Phe Ala Ile Ile
75 80 85

GAA CAA CAT CAT CAG GGA GAA ACC ACA TTA ACT TCT GGG TGT ATG AAG 522
Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu Thr Ser Gly Cys Met Lys
90 95 100

TAT CXA CCC TCT CAT TTT CAA TCC AAC CAT TCA CCC AAA OCC CAO CTA 570
Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp Ser Pro Lys Ala Gln Leu
105 110 115

CCC AGO ACA ATA CAA TCT TCT CCC ACC AAT TTC TCC AAC CAC TAT TTC 618
Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn Leu Cys Asn Gln Tyr Leu
120 125 130

CAC CCT ACA CTC CCC CCT CTT CTT ATA CCT CCC TTC TTT CAT CCC ACC 666
Gln Pro Thr Leu Pro Pro Val Val Ile Gly Pro Phe Phe Asp Gly Ser
135 140 145 150

ATC CGA TCC CTC CTT CTC CTC ATT TCC ATC CCT CTC TCT ATA CTT CCT 714
Ile Arg Trp Leu Val Val Leu Ile Ser Met Ala Val Cys Ile Val Ala
155 160 165

ATC ATC ATC TTC TCC ACC TCC TTT TCC TAT AAC CAT TAT TCT AAC ACT 762
Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr Lys His Tyr Cys Lys Ser
170 175 180

ATC TCA ACC ACC CCT CCT TAG AAC CCT CAT TTC CAA CAC CAT CAA CCA 810
Ile Ser Ser Arg Gly Arg Tyr Asn Arg Asp Leu Glu Gln Asp Glu Ala
185 190 195

TTT ATT CCA CTA CCA CAA TCA TTC AAA CAC CTC ATT CAC CAC TCC CAA 858
Phe Ile Pro Val Gly Glu Ser Leu Lys Asp Leu Ile Asp Gln Ser Gln
200 205 210

ACC TCT CCG ACT CCA TCT CCA TTC CCT TTA TTC CTT CAC CCA ACT ATT 906
Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu Leu Val Gln Arg Thr Ile
215 220 225 230

CCC AAA CAC ATT CAC ATC CTT CCC CAC CTT CCT AAA CCC CCC TAT CCA 954
Ala Lys Gln Ile Gln Met Val Arg Gln Val Gly Lys Gly Arg Tyr Gly
235 240 245

CAA CTA TCC ATC CCT AAA TCC CCT CCT CAA AAA CTC CCT CTC AAA CTC 1002
Glu Val Trp Met Gly Lys Trp Arg Gly Glu Lys Val Ala Val Lys Val
250 255 260

TTT TTT ACC ACT CAA CAA CCT ACC TCC TTT ACA CAA ACA CAA ATC TAC 1050
Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe Arg Glu Thr Glu Ile Tyr
265 270 275

CAC ACC CTC TTA ATC CCT CAT CAA AAT ATA CTT CCT TTT ATA CCT CCA 1098
Gln Thr Val Leu Met Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala
280 285 290

CAC ATT AAA CCC ACT CCT TCC TCC ACT CAC CTC TAT TTC ATT ACT CAT 1146
Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln Leu Tyr Leu Ile Thr Asp
295 300 305 310

TAC CAT CAA AAT CGA TCT CTC TAT CAC TTC CTC AAA TCT CCC ACA CTA 1194
Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe Leu Lys Cys Ala Thr Leu
315 320 325

CAC ACC ACA CCC CTA CTC AAC TTA CCT TAT TCT CCT CCT TCT CCT CTC 1242
Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr Ser Ala Ala Cys Gly Leu
330 335 340

TCC CAC CTC CAC ACA CAA ATT TAT CCT ACC CAA CCC AAC CCT CCA ATT 1290
Cys His Leu His Thr Glu Ile Tyr Gly Thr Gln Gly Lys Pro Ala Ile
345 350 355

CCT CAT CCA CAC CTC AAC ACC AAA AAC ATC CTT ATT AAG AAA AAT CCA 1338
Ala His Arg Asp Leu Lys Ser Lys Asn Ile Leu Ile Lys Lys Asn Gly
360 365 370

AGT TCC TCT ATT CCT CAC CTC CCC CTA CCT CTT AAA TTC AAC ACT CAT 1386
Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala Val Lys Phe Asn Ser Asp
375 380 385 390



ACA AAT CAA CTT CAC ATA CCC TTC AAT ACC ACC CTC CCC ACC AAC CCC 1434
Thr Asn Glu Val Asp Ile Pro Leu Asn Thr Arg Val Gly Thr Lys Arg
395 400 405

TAC ATC CCT CCA CAA CTC CTC CAT CAA ACC CTC AAT AAA AAC CAT TTC 1482
Tyr Met Ala Pro Glu Val Leu Asp Glu Ser Leu Asn Lys Asn His Phe
410 415 420

CAC CCC TAC ATC ATC CCT CAC ATC TAT ACC TTT CCT TTC ATC ATT TCC 1530
Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser Phe Gly Leu Ile Ile Trp
425 430 435

CAA ATC CCT CCT CCT TCT ATT ACA CCA CCA ATC CTC CAG CAA TAT CAA 1578
Glu Met Ala Arg Arg Cys Ile Thr Gly Gly Ile Val Glu Glu Tyr Gln
440 445 450

TTA CCA TAT TAC AAC ATC CTG CCC AGT CAC CCA TCC TAT GAG CAC ATC 1626
Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp Pro Ser Tyr Glu Asp Met
455 460 465 470

CCT CAC CTT CTC TCT CTC AAA CCC TTC CCC CCA ATC CTC TCT AAC CCC 1674
Arg Glu Val Val Cys Val Lys Arg Leu Arg Pro Ile Val Ser Asn Arg
475 480 485

TCC AAC ACC GAT CAA TCT CTT CCA CCA CTT TTC AAC CTA ATC TCA CAA 1722
Trp Asn Ser Asp Glu Cys Leu Arg Ala Val Leu Lys Leu Met Ser Glu
490 495 500



TCT TCC CCC CAT AAT CCA CCC TCC ACA CTC ACA CCT TTC ACA ATC AAC 1770 
Cys Trp Ala His Asn Pro Ala Ser Arg Leu Thr Ala Leu Arg Ile Lys
505 510 515 

AAC ACA CTT CCA AAA ATC CTT CAA TCC CAC CAT CTA AAC ATT 1812 
Lys Thr Leu Ala Lys Met Val Glu Ser Gln Asp Val Lys Ile
520 525 530 

TCACAATTAA ACAATTTTCA CCCACAATTT ACACTCCAAC AACTTCTTCA CCCAAGCAAT 1872 

CCCTCCCATT ACCATCCAAT AGCATCTTCA CTTCCTTTCC AGACTCCTTC CTCTACATCT 1932 

TCACACCCTC CTAACAGTAA ACCTTACCCT ACTCTACACA ATACAAGATT CCAACTTGCA 1992 

ACTTCAAACA TCTCATTCTT TATATATCAC ACCTTTGTTT TAATCTGCCC TTTTTTTCTT 2052

TCCTTTTTTT GTTTTCTT 2070

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 532 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Thr Gln Leu Tyr Thr Tyr Ile Arg Leu Leu Gly Ala Cys Leu Phe
1 5 10 15

Ile Ile Ser His Val Gln Gly Gln Asn Leu Asp Ser Met Leu His Gly
20 25 30

Thr Gly Met Lys Ser Asp Leu Asp Gln Lys Lys Pro Glu Asn Gly Val
35 40 45

Thr Leu Ala Pro Glu Asp Thr Leu Pro Phe Leu Lys Cys Tyr Cys Ser
50 55 60

Gly His Cys Pro Asp Asp Ala Ile Asn Asn Thr Cys Ile Thr Asn Gly
65 70 75 80

His Cys Phe Ala Ile Ile Glu Glu Asp Asp Gln Gly Glu Thr Thr Leu
85 90 95

Thr Ser Gly Cys Met Lys Tyr Glu Gly Ser Asp Phe Gln Cys Lys Asp
100 105 110

Ser Pro Lys Ala Gln Leu Arg Arg Thr Ile Glu Cys Cys Arg Thr Asn
115 120 125

Leu Cys Asn Gln Tyr Leu Gln Pro Thr Leu Pro Pro Val Val Ile Gly
130 135 140

Pro Phe Phe Asp Gly Ser Ile Arg Trp Leu Val Val Leu Ile Ser Met
145 150 155 160

Ala Val Cys Ile Val Ala Met Ile Ile Phe Ser Ser Cys Phe Cys Tyr
165 170 175

Lys His Tyr Cys Lys Ser Ile Ser Ser Arg Gly Arg Tyr Asn Arg Asp
180 185 190

Leu Glu Gln Asp Glu Ala Phe Ile Pro Val Gly Glu Ser Leu Lys Asp
195 200 205

Leu Ile Asp Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu Pro Leu
210 215 220

Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Arg Gln Val
225 230 235 240

Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg Gly Glu
245 250 255

Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser Trp Phe
260 265 270

Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu Asn Ile
275 280 285

Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp Thr Gln
290 295 300

Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr Asp Phe
305 310 315 320

Leu Lys Cys Ala Thr Leu Asp Thr Arg Ala Leu Leu Lys Leu Ala Tyr
325 330 335

Ser Ala Ala Cys Gly Leu Cys His Leu His Thr Glu Ile Tyr Gly Thr
340 345 350

Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys Asn Ile
355 360 365

Leu Ile Lys Lys Asn Gly Ser Cys Cys Ile Ala Asp Leu Gly Leu Ala
370 375 380

Val Lys Phe Asn Ser Asp Thr Asn Glu Val Asp Ile Pro Leu Asn Thr
385 390 395 400

Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu Val Leu Asp Glu Ser
405 410 415

Leu Asn Lys Asn His Phe Gln Pro Tyr Ile Met Ala Asp Ile Tyr Ser
420 425 430

Phe Gly Leu Ile Ile Trp Glu Met Ala Arg Arg Cys Ile Thr Gly Gly
435 440 445

Ile Val Glu Glu Tyr Gln Leu Pro Tyr Tyr Asn Met Val Pro Ser Asp
450 455 460

Pro Ser Tyr Glu Asp Met Arg Glu Val Val Cys Val Lys Arg Leu Arg
465 470 475 480

Pro Ile Val Ser Asn Arg Trp Asn Ser Asp Glu Cys Leu Arg Ala Val
485 490 495

Leu Lys Leu Met Ser Glu Cys Trp Ala His Asn Pro Ala Ser Arg Leu
500 505 510

Thr Ala Leu Arg Ile Lys Lys Thr Leu Ala Lys Met Val Glu Ser Gln
515 520 525

Asp Val Lys Ile



530 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2160 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 10..1524

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CCCGGTTAC ATG CCG GAG TCG GCC GGA CCC TCC TCC TTC TTC CCC CTT 48 
Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu
1 5 10 

GTT GTC CTC CTC CTC GCC GGC AGC GCC GGG TCC GGG CCC CGG GGG ATC 96 
Val Val Leu Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Ile
15 20 25 

CAG CCT CTG CTC TCT GCC TGC ACC AGC TCC CTA CAC ACC AAC TAG ACC 144 
Gln Ala Leu Leu Cys Ala Cys Thr Ser Cys Leu Gln Thr Asn Tyr Thr
30 35 40 45 

TCT CAG ACA GAT GGG CCT TGC ATG CTC TCC ATC TTT AAC CTG CAT CCC 192 
Cys Glu Thr Asp Gly Ala Cys Met Val Ser Ile Phe Asn Leu Asp Gly
50 55 60 

CTG CAC CAC CAT CTA CCT ACC TCC ATC CCC AAG CTC GAG CTG CTT CCT 240 
Val Glu His His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro
65 70 75 

OCT CCA AAG CCC TTC TAC TGC CTC AGT TCA CAG CAT CTG CCC AAC ACA 288 
Ala Gly Lys Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr
80 85 90

CAC TGC TGC TAT ATT GAC TTC TCC AAC AAG ATT GAC CTC AGC CTC CCC 336 
His Cys Cys Tyr Ile Asp Phe Cys Asn Lys Ile Asp Leu Arg Val Pro
95 100 105

AGC CCA CAC CTC AAG CAG CCT CCC CAC CCC TCC ATC TCC GGC CCT CTC 384 
Ser Gly His Leu Lys Glu Pro Ala His Pro Ser Met Trp Gly Pro Val
110 115 120 125 

GAG CTC CTC GGC ATC ATC GCC GGC CCC CTC TTC CTC CTC TTC CTT ATC 432 
Glu Leu Val Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile
130 135 140

ATT XTC ATC CTC TTC CTC CTC XTC AAC TAT CAC CAC CCT CTC TAG CAT 480
Ile Ile Ile Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His
145 150 155

AAC CCC CAC AGC TTC CAC ATC CAC CAC CCC TCT TCC CAC ATC TCT CTC 528
Asn Arg Gln Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu
160 165 170 

TCC AAA CAC AAG ACC CTC CAC CAT CTC CTC TAC CAC CTC TCC ACC TCA 576 
Ser Lys Asp Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser
175 180 185 

CCC TCT CCC TCA CCC TTA CCC CTT TTT CTC CAC CCC ACA CTC CCC CCA 624 
Gly Ser Gly Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg
190 195 200 205 

ACC ATT CTT TTA CAA CAC ATT ATC CCC AAC GGC CCC TTC GGC GAA CTA 672 
Thr Ile Val Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val
210 215 220 

TCC CCT CGT CCC TCC ACC GCT CCT CAC CTC CCT CTC AAA ATC TTC TCT 720 
Trp Arg Gly Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser
225 230 235 

TCT CCT CAA CAA CCC TCT TCC TTC CCT CAA CCA GAG ATC TAC CAC ACC 768 
Ser Arg Glu Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr
240 245 250 

CTC ATC CTC CCC CAT CAA AAC ATC CTT CCC TTT ATT CCT CCT CAC AAT 816 
Val Met Leu Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn
255 260 265 

AAA CAT AAT CCC ACC TCC ACC CAC CTC TCC CTT CTC TCT CAC TAT CAC 864 
Lys Asp Asn Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His
270 275 280 285 

CAC CAT CCC TCA CTG TTT CAT TAT CTC AAC CCC TAC ACA CTC ACC ATT 912 
Glu His Gly Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile
290 295 300 

CAC GGA ATC ATT AAC CTA CCC TTC TCT CCA CCC ACT CCT TTC CCA CAC 960 
Glu Gly Met Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His
305 310 315 

CTC CAT ATC CAC ATT CTC CCC ACT CAA CCC AAC CCC CCA ATT CCT CAT 1008 
Leu His Met Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His
320 325 330 

CCA CAC TTC AAC TCA AAC AAC ATC CTC CTG AAA AAA AAT CCC ATC TCT 1056
Arg Asp Leu Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys
335 340 345 

CCC ATT CCA CAC CTC CCC CTC GCT CTC CCT CAT GAT CCC CTC ACT CAC 1104 
Ala Ile Ala Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp
350 355 360 365 



ACC ATA GAC ATT CCT CCA AAT CAC ACC CTG GGG ACC AAA CCA TAC ATC 1152
Thr Ile Asp Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met
370 375 380

OCT CCT CAA CTC CTT CAC GAG ACA ATC AAC ATG AAG CAC TTT GAC TOO 1200
Ala Pro Glu Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser
385 390 395

TTC AAA TGT GCC CAC ATC TAT OCC CTC GGG CTT OTC TAC TOO GAG ATT 1248 
Phe Lys Cys Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile
400 405 410 

OCA CCA ACA TGC AAT TCT GGA GGA CTC CAT CAA CAC TAT CAA CTC CCG 1296 
Ala Arg Arg Cys Asn Ser Gly Gly Val His Glu Asp Tyr Gln Leu Pro
415 420 425 

TAT TAC CAC TTA CTC CCC TCC CAC CCT TCC ATT GAG GAC ATC CCA AAC 1344 
Tyr Tyr Asp Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys
430 435 440 445 

CTT CTA TCT GAC CAC AAG CXA CCG CCC AAT CTC CCC AAC TCC TGC CAC 1392 
Val Val Cys Asp Gln Lys Leu Arg Pro Asn Val Pro Asn Trp Trp Gln
450 455 460 

ACT TAT CAC CCC TTC CCA CTC ATC CCA AAC ATC ATC CCC CAC TCC TCC 1440 
Ser Tyr Glu Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp
465 470 475 

TAC CCC AAT CCT CCT CCC CCT CTC ACA CCT CTC CCC ATC AAC AAC ACT 1488 
Tyr Ala Asn Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr
480 485 490 

CTC TCC CAC CTA ACC CTC CAC CAA CAT CTC AAC ATT TAACCTCTTC 1534 
Leu Ser Gln Leu Ser Val Gln Glu Asp Val Lys Ile
495 500 505 



CTCTCCCTAC ACAAAGAACC TCCCCACTCA GCATCACTCC ACCCACCCTC CAACCGTCCT 1594

CCACGCCTAT CCTCZ rGTTT CTCCCCCCCC CTCTCCCACA CCCCTGGCCT CCAACAGCCA 1654

CACACCCTCC CACACCOCCC CACTCCCCTT CCCTTTCAGA CACACACTTT TTATATTTAC 1714

CTCCTCATCG CATCCACACC TCACCAAATC ATGTACTCAC TCAATCCCAC AACTCAAACT 1774

CCTTCACTCC CAACTACACA CACCCAGTCC ATTCCCTCTC CACCACCCTG ACCTCCTCGC 1834

CTCCCCACCA CCCCCCCCCA TACCTTCTCC TCCACTGCGC TCCACCTTTT CCTCCACCCA 1894

CCAGTCAACT GGCATCAACA TATTCAGACG AACCCCAACT TTCTCCCTCC TTCCCCTACC 1954

ACTCCTCACC CACACCATCC TTCTCATCCA CATCCCCACC ACTCCCCCTA GACACACAAC 2014

CTGCTCCCTC TCTGTCCACC CAAGTGCCCA TCTCCCCACC TGTGTCCCAC ATTCTCCCTC 2074

CTCTGTGCCA CCCCCGTCTG TCTCTCTCTG TCTCTGACTC ACTCTCTCTC TCTACACTTA 2134

ACCTGCTTGA CCTTCTCTCC ATGTCT 2160



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 505 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Met Ala Glu Ser Ala Gly Ala Ser Ser Phe Phe Pro Leu Val Val Leu
1 5 10 15

Leu Leu Ala Gly Ser Gly Gly Ser Gly Pro Arg Gly Ile Gln Ala Leu
20 25 30

Leu Cys Ala Cys Thr Ser Cys Leu Gln Thr Asn Tyr Thr Cys Glu Thr
35 40 45

Asp Gly Ala Cys Met Val Ser Ile Phe Asn Leu Asp Gly Val Glu His
50 55 60

His Val Arg Thr Cys Ile Pro Lys Val Glu Leu Val Pro Ala Gly Lys
65 70 75 80

Pro Phe Tyr Cys Leu Ser Ser Glu Asp Leu Arg Asn Thr His Cys Cys
85 90 95

Tyr Ile Asp Phe Cys Asn Lys Ile Asp Leu Arg Val Pro Ser Gly His
100 105 110

Leu Lys Glu Pro Ala His Pro Ser Met Trp Gly Pro Val Glu Leu Val
115 120 125

Gly Ile Ile Ala Gly Pro Val Phe Leu Leu Phe Leu Ile Ile Ile Ile
130 135 140

Val Phe Leu Val Ile Asn Tyr His Gln Arg Val Tyr His Asn Arg Gln
145 150 155 160

Arg Leu Asp Met Glu Asp Pro Ser Cys Glu Met Cys Leu Ser Lys Asp
165 170 175

Lys Thr Leu Gln Asp Leu Val Tyr Asp Leu Ser Thr Ser Gly Ser Gly
180 185 190

Ser Gly Leu Pro Leu Phe Val Gln Arg Thr Val Ala Arg Thr Ile Val
195 200 205

Leu Gln Glu Ile Ile Gly Lys Gly Arg Phe Gly Glu Val Trp Arg Gly
210 215 220

Arg Trp Arg Gly Gly Asp Val Ala Val Lys Ile Phe Ser Ser Arg Glu
225 230 235 240

Glu Arg Ser Trp Phe Arg Glu Ala Glu Ile Tyr Gln Thr Val Met Leu
245 250 255

Arg His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Asn Lys Asp Asn
260 265 270

Gly Thr Trp Thr Gln Leu Trp Leu Val Ser Asp Tyr His Glu His Gly
275 280 285

Ser Leu Phe Asp Tyr Leu Asn Arg Tyr Thr Val Thr Ile Glu Gly Met
290 295 300

Ile Lys Leu Ala Leu Ser Ala Ala Ser Gly Leu Ala His Leu His Met
305 310 315 320

Glu Ile Val Gly Thr Gln Gly Lys Pro Gly Ile Ala His Arg Asp Leu
325 330 335

Lys Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Met Cys Ala Ile Ala
340 345 350

Asp Leu Gly Leu Ala Val Arg His Asp Ala Val Thr Asp Thr Ile Asp
355 360 365

Ile Ala Pro Asn Gln Arg Val Gly Thr Lys Arg Tyr Met Ala Pro Glu
370 375 380

Val Leu Asp Glu Thr Ile Asn Met Lys His Phe Asp Ser Phe Lys Cys
385 390 395 400

Ala Asp Ile Tyr Ala Leu Gly Leu Val Tyr Trp Glu Ile Ala Arg Arg
405 410 415

Cys Asn Ser Gly Gly Val His Glu Asp Tyr Gln Leu Pro Tyr Tyr Asp
420 425 430

Leu Val Pro Ser Asp Pro Ser Ile Glu Glu Met Arg Lys Val Val Cys
435 440 445

Asp Gln Lys Leu Arg Pro Asn Val Pro Asn Trp Trp Gln Ser Tyr Glu
450 455 460

Ala Leu Arg Val Met Gly Lys Met Met Arg Glu Cys Trp Tyr Ala Asn
465 470 475 480

Gly Ala Ala Arg Leu Thr Ala Leu Arg Ile Lys Lys Thr Leu Ser Gln
485 490 495

Leu Ser Val Gln Glu Asp Val Lys Ile
500 505

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1952 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 




(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Mouse

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 187..1692

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

AAGCGCCGGC AGAAGTTGCC GGOGTGGTGC TCGTAGTGAG CGCGCGGAGG ACCCGGGACC 60 

TGGGAAGCGG CGGCGGGTTA ACTTCGGCTG AATCACAACC ATTTGGCGCT GAGCTATGAC 120 

AAGAGAGCAA ACAAAAAGTT AAAGGAGCAA CCCGGCCATA AGTCAAGAGA GAACTTTATT 180 

GATAAC ATC CTC TTA CCA AGC TCT CCA AAA TTA AAT GTG GGC ACC AAG 228 
Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val Gly Thr Lys
1 5 10 

AAG GAG GAT GGA GAG AGT ACA GCC CCC ACC CCT CGG CCC AAG ATC CTA 276 
Lys Glu Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg Pro Lys Ile Leu
15 20 25 30 

CGT TCT AAA TGC CAC CAC CAC TCT CCG CAA CAC TCA GTC AAC AAT ATC 324 
Arg Cys Lys Cys His His His Cys Pro Glu Asp Ser Val Asn Asn Ile
35 40 45 

TGC AGC ACA GAT GGG TAC TGC TTC ACG ATG ATA GAA GAA GAT CAC TCT 372 
Cys Ser Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu Glu Asp Asp Ser
50 55 60 

GGA ATG CCT CTT CTC ACC TCT GGA TGT CTA GGA CTA CAA GGG TCA GAT 420 
Gly Met Pro Val Val Thr Ser Gly Cys Leu Gly Leu Glu Gly Ser Asp
65 70 75 

TTT CAA TGT CCT CAC ACT CCC ATT CCT CAT CAA AGA AGA TCA ATT CAA 468 
Phe Gln Cys Arg Asp Thr Pro Ile Pro His Gln Arg Arg Ser Ile Glu
80 85 90 

TGC TCC ACA GAA AGO AAT GAG TGT AAT AAA CAC CTC CAC CCC ACT CTC 516 
Cys Cys Thr Glu Arg Asn Glu Cys Asn Lys Asp Leu His Pro Thr Leu 
95 100 105 110 

CCT CCT CTC AAG GAC AGA CAT TTT GTT GAT GGG CCC ATA CAC CAC AAG 564 
Pro Pro Leu Lys Asp Arg Asp Phe Val Asp Gly Pro Ile His His Lys
115 120 125 

CCC TTG CTT ATC TCT GTG ACT GTC TGT ACT TTA CTC TTC CTC CTC ATT 612 
Ala Leu Leu Ile Ser Val Thr Val Cys Ser Leu Leu Leu Val Leu Ile
130 135 140 



ATT TTA TTC TCT TAC TTC AGG TAT AAA AGA CAA CAA CCC CCA CCT CCG 660
Ile Leu Phe Cys Tyr Phe Arg Tyr Lys Arg Gln Glu Ala Arg Pro Arg
145 150 155

TAC AGC ATT COG CTC GAG CAO CAC GAG ACA TAC ATT CCT CCI CGA GAG 708
Tyr Ser Ile Gly Leu Glu Gln Asp Glu Thr Tyr Ile Pro Pro Gly Glu
160 165 170

TCC CTC ACA CAC TTC ATC CAG CAC TCT CAC ACC TCC CGA ACT CCA TCA 756 
Ser Leu Arg Asp Leu Ile Glu Gln Ser Gln Ser Ser Gly Ser Gly Ser
175 180 185 190 



GGC CTC CCT CTC CTG CTC CAA ACC ACA ATA CCT AAG CAA ATT CAG ATC 804 
Gly Leu Pro Leu Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met
195 200 205

CTC AAC CAC ATT CCA AAA CCC CCC TAT CCC CAG CTC TCC ATC CCA AAG 852
Val Lys Gln Ile Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys
210 215 220 

TCC CCT CCA CAA AAG CTG CCT CTC AAA CTG TTC TTC ACC ACC CAG GAA 900 
Trp Arg Gly Glu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu
225 230 235 

CCC ACC TCC TTC CGA CAG ACT CAC ATA TAT CAC ACC CTC CTG ATC CCC 948 
Ala Ser Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg
240 245 250 

CAT GAG AAT ATT CTC GOG TTC ATT CCT CCA GAT ATC AAA GGG ACT CCC 996 
His Glu Asn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly
255 260 265 270 

TCC TCC ACT CAC TTC TAC CTC ATC ACA CAC TAT CAT CAA AAC CCC TCC 1044 
Ser Trp Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser
275 280 285 

CTT TAT CAC TAT CTC AAA TCC ACC ACC TTA CAC CCA AAG TCC ATC CTC 1092 
Leu Tyr Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met Leu
290 295 300 

AAC CTA CCC TAC TCC TCT CTC ACC CCC CTA TCC CAT TTA CAC ACC GAA 1140 
Lys Leu Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His Thr Glu
305 310 315

ATC TTT ACC ACT CAA CCC AAC CCA CCA ATC GCC CAT CCA GAC TTC AAA 1188 
Ile Phe Ser Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys
320 325 330 

ACT AAA AAC ATC CTC GTC AAC AAA AAT CCA ACT TCC TCC ATA CCA CAC 1236 
Ser Lys Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp
335 340 345 350 

CTC GCC TTC CCT GTC AAG TTC ATT ACT GAC ACA AAT CAG GTT GAC ATC 1284 
Leu Gly Leu Ala Val Lys Phe Ile Ser Asp Thr Asn Glu Val Asp Ile
355 360 365 

CCA CCC AAC ACC CCG CTT CCC ACC AAC CCC TAT ATC CCT CCA GAA CTC 1332 
Pro Pro Asn Thr Arg Val Gly Thr Lys Arg Tyr Met Pro Pro Glu Val
370 375 380 



CTG CAC CAG ACC TTC AAT ACA AAC CAT TTC CAC TCC TAC ATT ATC CCT 1380
Leu Asp Glu Ser Leu Asn Arg Asn His Phe Gln Ser Tyr Ile Met Ala
385 390 395

CAC XTC TAC AGC TTT CCA CTC ATC CTC TCC CAO ATT CCA AGO ACA TCT 1428
Asp Met Tyr Ser Phe Gly Leu Ile Leu Trp Glu Ile Ala Arg Arg Cys
400 405 410

CTT TCT CCA CCT ATA CTC CAA CAA TAC CAC CTT CCC TAT CAC CAC CTC 1476 
Val Ser Gly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr His Asp Leu
415 420 425 430 

CTC CCC ACT CAC CCT TCT TAT CAC CAC ATC AGA CAA ATT CTC TCC ATC 1524 
Val Pro Ser Asp Pro Ser Tyr Glu Asp Met Arg Glu Ile Val Cys Met
435 440 445 

AAC AAC TTA CCG CCT TCA TTC CCC AAT CCA TCC ACC AGT CAT CAG TCT 1572 
Lys Lys Leu Arg Pro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys
450 455 460 

CTC ACC CAG ATC CCC AAC CTT ATC ACA GAG TCC TCC CCC CAC AAT CCT 1620 
Leu Arg Gln Met Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro
465 470 475 

CCC TCC ACC CTC ACC CCC CTC ACA GTT AAC AAA ACC CTT GCC AAA ATC 1668 
Ala Ser Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met
480 485 490 

TCA CAG TCC CAG CAC ATT AAA CTC TCACCTCACA TACTTCTCCA CAGAGCAAGA 1722 
Ser Glu Ser Gln Asp Ile Lys Leu
495 500 

ATTTCACACA AGCATCCTTA CCCCAAGCCT TCAACCTTAC CCTACTGCCC AGTCAGTTCA 1782 

CACTTTCCTC GAAGAGACCA CCCTCCCCAG ACACAGAGGA ACCCAGAAAC ACCCATTCAT 1842 

CATCGCTTTC TCAGCAGGAG AAACTCTTTC GGTAACTTCT TCAAGATATG ATCCATCTTG 1902 

CTTTCTAAGA AAGCCCTCTA TTTTCAATTA CC ATTTTTTT ATAAAAAAAA 1952 



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 502 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



Met Leu Leu Arg Ser Ser Gly Lys Leu Asn Val Gly Thr Lys Lys Glu
1 5 10 15

Asp Gly Glu Ser Thr Ala Pro Thr Pro Arg Pro Lys Ile Leu Arg Cys
20 25 30

Lys Cys His His His Cys Pro Glu Asp Ser Val Asn Asn Ile Cys Ser
35 40 45

Thr Asp Gly Tyr Cys Phe Thr Met Ile Glu Glu Asp Asp Ser Gly Met
50 55 60

Pro Val Val Thr Ser Gly Cys Leu Gly Leu Glu Gly Ser Asp Phe Gln
65 70 75 80

Cys Arg Asp Thr Pro Ile Pro His Gln Arg Arg Ser Ile Glu Cys Cys
85 90 95

Thr Glu Arg Asn Glu Cys Asn Lys Asp Leu His Pro Thr Leu Pro Pro
100 105 110

Leu Lys Asp Arg Asp Phe Val Asp Gly Pro Ile His His Lys Ala Leu
115 120 125

Leu Ile Ser Val Thr Val Cys Ser Leu Leu Leu Val Leu Ile Ile Leu
130 135 140

Phe Cys Tyr Phe Arg Tyr Lys Arg Gln Glu Ala Arg Pro Arg Tyr Ser
145 150 155 160

Ile Gly Leu Glu Gln Asp Glu Thr Tyr Ile Pro Pro Gly Glu Ser Leu
165 170 175

Arg Asp Leu Ile Glu Gln Ser Gln Ser Ser Gly Ser Gly Ser Gly Leu
180 185 190

Pro Leu Leu Val Gln Arg Thr Ile Ala Lys Gln Ile Gln Met Val Lys
195 200 205

Gln Ile Gly Lys Gly Arg Tyr Gly Glu Val Trp Met Gly Lys Trp Arg
210 215 220

Gly Glu Lys Val Ala Val Lys Val Phe Phe Thr Thr Glu Glu Ala Ser
225 230 235 240

Trp Phe Arg Glu Thr Glu Ile Tyr Gln Thr Val Leu Met Arg His Glu
245 250 255

Asn Ile Leu Gly Phe Ile Ala Ala Asp Ile Lys Gly Thr Gly Ser Trp
260 265 270

Thr Gln Leu Tyr Leu Ile Thr Asp Tyr His Glu Asn Gly Ser Leu Tyr
275 280 285

Asp Tyr Leu Lys Ser Thr Thr Leu Asp Ala Lys Ser Met Leu Lys Leu
290 295 300

Ala Tyr Ser Ser Val Ser Gly Leu Cys His Leu His Thr Glu Ile Phe
305 310 315 320

Ser Thr Gln Gly Lys Pro Ala Ile Ala His Arg Asp Leu Lys Ser Lys
325 330 335

Asn Ile Leu Val Lys Lys Asn Gly Thr Cys Cys Ile Ala Asp Leu Gly
340 345 350

Leu Ala Val Lys Phe Ile Ser Asp Thr Asn Glu Val Asp Ile Pro Pro
355 360 365

Asn Thr Arg Val Gly Thr Lys Arg Tyr Met Pro Pro Glu Val Leu Asp
370 375 380

Glu Ser Leu Asn Arg Asn His Phe Gln Ser Tyr Ile Met Ala Asp Met
385 390 395 400

Tyr Ser Phe Gly Leu Ile Leu Trp Glu Ile Ala Arg Arg Cys Val Ser
405 410 415

Gly Gly Ile Val Glu Glu Tyr Gln Leu Pro Tyr His Asp Leu Val Pro
420 425 430

Ser Asp Pro Ser Tyr Glu Asp Met Arg Glu Ile Val Cys Met Lys Lys
435 440 445

Leu Arg Pro Ser Phe Pro Asn Arg Trp Ser Ser Asp Glu Cys Leu Arg
450 455 460

Gln Met Gly Lys Leu Met Thr Glu Cys Trp Ala Gln Asn Pro Ala Ser
465 470 475 480

Arg Leu Thr Ala Leu Arg Val Lys Lys Thr Leu Ala Lys Met Ser Glu
485 490 495

Ser Gln Asp Ile Lys Leu

500 
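
The coding-region locations and the protein lengths stated in this listing can be cross-checked with a little arithmetic: a CDS feature given as an inclusive, 1-based span should contain a whole number of codons, and that number should equal the length of the corresponding protein entry. The following is only an illustrative sketch, not part of the original listing; the function name `cds_codon_count` and the `entries` table are hypothetical, and the sketch assumes the spans 10..1524 and 187..1692 do not include a stop codon.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive, 1-based CDS span."""
    length = end - start + 1  # both endpoints belong to the feature
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a whole number of codons")
    return length // 3


# (CDS start, CDS end, stated protein length), as given in the listing.
entries = {
    "SEQ ID NO: 15 / 16": (10, 1524, 505),
    "SEQ ID NO: 17 / 18": (187, 1692, 502),
}

for name, (start, end, residues) in entries.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == residues else "MISMATCH"
    print(f"{name}: {codons} codons vs {residues} residues -> {status}")
```

Under these assumptions both nucleotide entries agree with the lengths of their protein entries.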

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CCGGATCCTC TTGTCAAGGN AATATCTC 28 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
GCGATCCGTC GCAGTCAAAA TTTT 24 

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
GCGCATCCCC CATATATTAA AAGCAA 26 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CGGAATTCTG CTCCCATATA 20 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
ATTCAAGGCC ACATCAACTT CATTTGTGTC ACTGTTC 37 



(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GCGCATCCAC CATGGCGGAG TCGGCC 26 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
AACACCGGCC CGGCGATGAT 20

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Gly Xaa Gly Xaa Xaa Gly

1 5



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Asp Phe Lys Ser Arg Asn

1 5 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Asp Leu Lys Ser Lys Asn

1 5 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Gly Thr Lys Arg Tyr Met
1 5